From: Nicholas H. <he...@se...> - 2002-10-22 19:33:21
|
On Fri, 18 Oct 2002 12:47:33 -0400 er...@he... wrote:

-- is this normal, or is it 'fixable'?
>
> It's normal and fixable. The issue is that the cruddy I/O forwarding
> hack only works for processes created directly from the front end. In
> the tree spawn with vrfork() only rank 0 is created directly from the
> front end.

Gotcha -- I am working on that now -- using bpsh's base functions.

>
> The fix would be to add a bpsh-like I/O forwarder to mpirun and do a
> similar I/O thing.
>
> - Erik

Here is a patch to catch the error when /etc/beowulf/nodeinfo is missing:

bash-2.05$ diff -urN /home/henken/src/bproc/mpirun-0.2/mpirun.c mpirun.c
--- /home/henken/src/bproc/mpirun-0.2/mpirun.c Tue Oct 22 15:33:06 2002
+++ mpirun.c Tue Oct 22 15:30:26 2002
@@ -101,7 +101,11 @@
     list_size = bproc_numnodes();
-    f = fopen(dbfile, "r");
+    if ((f = fopen(dbfile, "r")) == NULL) {
+        fprintf(stderr,"could not open %s\n", dbfile);
+        perror("fopen");
+        exit(1);
+    }
     lk.l_type = F_RDLCK;
     lk.l_whence = SEEK_SET;
     lk.l_start = 0;

And: shouldn't bproc_vexecmove_io fail if prog points to a nonexistent program?

cheers
Nic
--
Nicholas Henke
Linux Cluster Systems Programmer
he...@se... - 215.573.8149 |
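For readers who want the pattern from the patch above in isolation, here is a minimal sketch. It assumes a POSIX system; `open_locked_for_read` is a hypothetical helper name, not a function in mpirun, and the use of `F_SETLKW` (blocking lock) is an assumption, since the patch context only shows the `struct flock` fields being filled in. Unlike the patch, it returns NULL instead of calling `exit(1)`, so the caller decides what to do.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Open `path` for reading and take a shared (read) lock on the whole file.
 * Returns the stream, or NULL after printing a diagnostic to stderr.
 * Hypothetical helper for illustration; not part of mpirun. */
FILE *open_locked_for_read(const char *path)
{
    struct flock lk;
    FILE *f = fopen(path, "r");

    if (f == NULL) {
        fprintf(stderr, "could not open %s: %s\n", path, strerror(errno));
        return NULL;
    }

    memset(&lk, 0, sizeof(lk));
    lk.l_type = F_RDLCK;    /* shared lock: other readers are still allowed */
    lk.l_whence = SEEK_SET;
    lk.l_start = 0;
    lk.l_len = 0;           /* 0 means "up to end of file" */

    /* Assumption: block until the lock is granted (F_SETLKW). */
    if (fcntl(fileno(f), F_SETLKW, &lk) == -1) {
        fprintf(stderr, "could not lock %s: %s\n", path, strerror(errno));
        fclose(f);
        return NULL;
    }
    return f;
}
```

Without the NULL check, the later `fileno(f)` call would be made on a NULL stream, which is the failure the patch guards against.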
From: <er...@he...> - 2002-10-21 19:26:25
|
On Mon, Oct 21, 2002 at 10:06:44AM -0400, Nicholas Henke wrote:
> Hrm -- I seem to be breaking things again. The following oops occurred
> running my 'noop' script again -- the ps script is that script, just
> remove the bpsh $node ps line. I have seen this oops twice so far -- the
> Code: was the same in both. If the trace is bogus, I would _really_
> appreciate any pointers you could give me to get better traces for you.
>
> Cheers!
> Nic
>
> Unable to handle kernel paging request at virtual address 0804a59b
> *pde = 1bde1067
> Oops: 0003
> CPU: 1
> EIP: [<c0116c6c>] Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EFLAGS: 00010046
> Process bpslave (pid: 31698, stackpage=d8a45000)
> Stack: 00000001 00000282 00000003 dca85ef0 dc488cc0 00000000 dbd9800
> c010608c
> dca85ef0 dc40bfa0 00000000 e097af39 00000000 00000000 00000000 00000000
> 00000000
> 00000000 00000000 00000000 00000000 00000000 00000000 00000011
> Call Trace: [<c010608c>] [<e097af1d>] [<e096f329>] [<e097a698>]
> [<e096f313>]
> [<e097a698>]
> Code: c7 01 00 00 00 00 8b 41 3c 85 c0 75 2d a1 c0 d1 26 c0 8d 51
>
> >>EIP; c0116c6c <__wake_up+4c/c0> <=====
> Trace; c010608c <__up_wakeup+8/c>
> Trace; e097af1d <__module_using_checksums+5893/????>
> Trace; e096f329 <[bproc]bproc_iod_release+3d/89>
> Trace; e097a698 <__module_using_checksums+500e/????>
> Trace; e096f313 <[bproc]bproc_iod_release+27/89>
> Trace; e097a698 <__module_using_checksums+500e/????>

Hrm. It's hard to give generic pointers. I usually try to look at the
whole mess and try to figure it out. This back trace looks reasonable
except for the __module_using_checksums part. That's weird.

A lot of the time I get the module binary and look for the code in it
to see what function it's in. In this case you're in __wake_up so
that's not that useful.

As always, reproducing it is the most reliable way to go. I just saw
some more weirdness so I'm looking into it.

- Erik |
From: Wilton W. <ww...@ha...> - 2002-10-21 15:54:08
|
I think it may be the same problem.. on first glance.. the patch that was
posted appears only to make the window for error smaller.. I think you need
to do some locking on it.. I'll check ;)

- Wilton

On Mon, 21 Oct 2002, Nicholas Henke wrote:
> Hrm -- I seem to be breaking things again. The following oops occurred
> running my 'noop' script again -- the ps script is that script, just
> remove the bpsh $node ps line. I have seen this oops twice so far -- the
> Code: was the same in both. If the trace is bogus, I would _really_
> appreciate any pointers you could give me to get better traces for you.
>
> Cheers!
> Nic
>
> Unable to handle kernel paging request at virtual address 0804a59b
> *pde = 1bde1067
> Oops: 0003
> CPU: 1
> EIP: [<c0116c6c>] Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EFLAGS: 00010046
> Process bpslave (pid: 31698, stackpage=d8a45000)
> Stack: 00000001 00000282 00000003 dca85ef0 dc488cc0 00000000 dbd9800
> c010608c
> dca85ef0 dc40bfa0 00000000 e097af39 00000000 00000000 00000000 00000000
> 00000000
> 00000000 00000000 00000000 00000000 00000000 00000000 00000011
> Call Trace: [<c010608c>] [<e097af1d>] [<e096f329>] [<e097a698>]
> [<e096f313>]
> [<e097a698>]
> Code: c7 01 00 00 00 00 8b 41 3c 85 c0 75 2d a1 c0 d1 26 c0 8d 51
>
> >>EIP; c0116c6c <__wake_up+4c/c0> <=====
> Trace; c010608c <__up_wakeup+8/c>
> Trace; e097af1d <__module_using_checksums+5893/????>
> Trace; e096f329 <[bproc]bproc_iod_release+3d/89>
> Trace; e097a698 <__module_using_checksums+500e/????>
> Trace; e096f313 <[bproc]bproc_iod_release+27/89>
> Trace; e097a698 <__module_using_checksums+500e/????>
> Code; c0116c6c <__wake_up+4c/c0>
> 00000000 <_EIP>:
> Code; c0116c6c <__wake_up+4c/c0> <=====
> 0: c7 01 00 00 00 00 movl $0x0,(%ecx) <=====
> Code; c0116c72 <__wake_up+52/c0>
> 6: 8b 41 3c mov 0x3c(%ecx),%eax
> Code; c0116c75 <__wake_up+55/c0>
> 9: 85 c0 test %eax,%eax
> Code; c0116c77 <__wake_up+57/c0>
> b: 75 2d jne 3a <_EIP+0x3a> c0116ca6
> <__wake_up+86/c0>
> Code; c0116c79 <__wake_up+59/c0>
> d: a1 c0 d1 26 c0 mov 0xc026d1c0,%eax
> Code; c0116c7e <__wake_up+5e/c0>
> 12: 8d 51 00 lea 0x0(%ecx),%edx
>
>
> --
> Nicholas Henke
> Linux Cluster Systems Programmer
> he...@se... - 215.573.8149

----[ Wilton William Wong ]---------------------------------------------
11060-166 Avenue    Ph : 01-780-456-9771       High Performance UNIX
Edmonton, Alberta   FAX: 01-780-456-9772       and Linux Solutions
T5X 1Y3, Canada     URL: http://www.harddata.com
-------------------------------------------------------[ Hard Data Ltd. ]---- |
From: Nicholas H. <he...@se...> - 2002-10-21 14:07:06
|
Hrm -- I seem to be breaking things again. The following oops occurred
running my 'noop' script again -- the ps script is that script, just
remove the bpsh $node ps line. I have seen this oops twice so far -- the
Code: was the same in both. If the trace is bogus, I would _really_
appreciate any pointers you could give me to get better traces for you.

Cheers!
Nic

Unable to handle kernel paging request at virtual address 0804a59b
*pde = 1bde1067
Oops: 0003
CPU: 1
EIP: [<c0116c6c>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010046
Process bpslave (pid: 31698, stackpage=d8a45000)
Stack: 00000001 00000282 00000003 dca85ef0 dc488cc0 00000000 dbd9800
c010608c
dca85ef0 dc40bfa0 00000000 e097af39 00000000 00000000 00000000 00000000
00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000011
Call Trace: [<c010608c>] [<e097af1d>] [<e096f329>] [<e097a698>]
[<e096f313>]
[<e097a698>]
Code: c7 01 00 00 00 00 8b 41 3c 85 c0 75 2d a1 c0 d1 26 c0 8d 51

>>EIP; c0116c6c <__wake_up+4c/c0> <=====
Trace; c010608c <__up_wakeup+8/c>
Trace; e097af1d <__module_using_checksums+5893/????>
Trace; e096f329 <[bproc]bproc_iod_release+3d/89>
Trace; e097a698 <__module_using_checksums+500e/????>
Trace; e096f313 <[bproc]bproc_iod_release+27/89>
Trace; e097a698 <__module_using_checksums+500e/????>
Code; c0116c6c <__wake_up+4c/c0>
00000000 <_EIP>:
Code; c0116c6c <__wake_up+4c/c0> <=====
0: c7 01 00 00 00 00 movl $0x0,(%ecx) <=====
Code; c0116c72 <__wake_up+52/c0>
6: 8b 41 3c mov 0x3c(%ecx),%eax
Code; c0116c75 <__wake_up+55/c0>
9: 85 c0 test %eax,%eax
Code; c0116c77 <__wake_up+57/c0>
b: 75 2d jne 3a <_EIP+0x3a> c0116ca6 <__wake_up+86/c0>
Code; c0116c79 <__wake_up+59/c0>
d: a1 c0 d1 26 c0 mov 0xc026d1c0,%eax
Code; c0116c7e <__wake_up+5e/c0>
12: 8d 51 00 lea 0x0(%ecx),%edx

--
Nicholas Henke
Linux Cluster Systems Programmer
he...@se... - 215.573.8149 |
From: Nicholas H. <he...@se...> - 2002-10-18 18:17:05
|
bproc_common.h is missing from the rpms:

diff -urN clean/bproc.spec bproc-3.2.0/bproc.spec
--- clean/bproc.spec 2002-03-26 21:39:08.000000000 -0500
+++ bproc-3.2.0/bproc.spec 2002-10-03 11:10:03.000000000 -0400
@@ -190,7 +191,7 @@
 %files devel
 %defattr(-,root,root)
-/usr/include/sys/bproc.h
+/usr/include/sys/*.h
 /usr/lib/libbproc.a
 /usr/lib/libbproc.so
 /usr/lib/libbpslave.a

Nic
--
Nicholas Henke
Linux Cluster Systems Programmer
he...@se... - 215.573.8149 |
From: Nicholas H. <he...@se...> - 2002-10-18 17:20:13
|
On Fri, 18 Oct 2002 12:53:55 -0400 er...@he... wrote:

> Ok, I found a race in the procfs code with PID mapping. Here's a
> patch to actually fix the problem. Apply this to bproc/kernel/hooks.c
>
> It looked like your crash dump may have come from elsewhere so I hope
> the problem I'm seeing is the same one you're seeing. In any case your
> little test script stopped killing the slave node after applying this
> one.
>

Thanks!
Nic |
From: <er...@he...> - 2002-10-18 17:11:58
|
On Fri, Oct 18, 2002 at 11:18:13AM -0400, er...@he... wrote:
> On Fri, Oct 18, 2002 at 10:46:57AM -0400, Nicholas Henke wrote:
> > On Fri, 18 Oct 2002 10:25:27 -0400
> > er...@he... wrote:
> >
> > > On Wed, Oct 16, 2002 at 04:06:26PM -0400, Nicholas Henke wrote:
> > > >
> > > > I can cause a node to oops every time with the attached script. I am
> > > > running bproc-3.2.0 on 2.4.18, with the procfs locking patch.
> > > > Here is the oops fed through ksymoops after rebooting.
> > >
> > > In general, I'd say I don't worry about co-existing nicely with other
> > > patches. It's just too much trouble for me. If you point me at the
> > > other patch you're using I might be able to see what the problem is
> > > pretty quickly though. More likely, the locking is all moved around
> > > in procfs and the BProc hooks need to be modified accordingly. The
> > > BProc hooks into procfs are pretty messy since things need to be done
> > > atomically and there's a bunch of lock grabbing/releasing code already
> > > in there.
> > >
> > Sorry for the confusion. -- it is the patch that you sent me for the
> > last oops I sent, where we moved the bproc_hook_imv call inside the lock
> > in fs/proc/array.c ( I think that is the file.)
>
> Oh, I see. I should extend that policy to my own patches :)
>
> Seriously though, I took your script and reproduced it so I'm looking
> at it now.

Ok, I found a race in the procfs code with PID mapping. Here's a
patch to actually fix the problem. Apply this to bproc/kernel/hooks.c

It looked like your crash dump may have come from elsewhere so I hope
the problem I'm seeing is the same one you're seeing. In any case your
little test script stopped killing the slave node after applying this
one.

- Erik

diff -u -r1.48 -r1.49
--- hooks.c 27 Sep 2002 19:22:32 -0000 1.48
+++ hooks.c 18 Oct 2002 16:43:29 -0000 1.49
@@ -17,7 +17,7 @@
  * along with this program; if not, write to the Free Software
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
  *
- * $Id: hooks.c,v 1.48 2002/09/27 19:22:32 hendriks Exp $
+ * $Id: hooks.c,v 1.49 2002/10/18 16:43:29 hendriks Exp $
  *-----------------------------------------------------------------------*/
 #define __NO_VERSION__
@@ -681,12 +681,21 @@
 static
 int bproc_hook_proc_pid(struct task_struct *p) {
-    return do_pid_mapping() ? p->bproc.masq->pid : p->pid;
+    if (!do_pid_mapping()) return p->pid;
+    /* Due to the fact that readdir and opening/reading the process
+       files is not an atomic operation, there is a possibility that p
+       will no longer be a masqueraded process by the time we get here.
+       If that's the case, just return 0 for the pid. */
+    if (!BPROC_ISMASQ(p)) return 0;
+    return p->bproc.masq->pid;
 }
 static
 int bproc_hook_proc_ppid(struct task_struct *p) {
-    return do_pid_mapping() ? p->bproc.masq->ppid : p->p_opptr->pid;
+    if (!do_pid_mapping()) return p->pid ? p->p_opptr->pid : 0;
+    /* See the note in bproc_hook_proc_pid... */
+    if (!BPROC_ISMASQ(p)) return 0;
+    return p->bproc.masq->ppid;
 }
 static
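The shape of the fix in the patch above can be sketched in userspace C: check that the masquerade data is still present before dereferencing it, and fall back to 0 otherwise. Everything here is a simplified, hypothetical stand-in (`struct task`, `struct masq_info`, `reported_pid` and the `mapping_enabled` flag are not BProc names); the real code works on `struct task_struct` inside the kernel.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct masq_info {
    int pid;
    int ppid;
};

struct task {
    int pid;                 /* real PID on this node */
    struct masq_info *masq;  /* NULL once the task is no longer masqueraded */
};

/* Return the PID to report for `t`.
 * `mapping_enabled` plays the role of do_pid_mapping() in the patch. */
int reported_pid(const struct task *t, int mapping_enabled)
{
    if (!mapping_enabled)
        return t->pid;
    if (t->masq == NULL)     /* task stopped being masqueraded: report 0 */
        return 0;
    return t->masq->pid;
}
```

Note that a check like this does not by itself serialize against a concurrent update; as another reply in this thread suggests, it may only narrow the window, and locking may still be needed.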
From: <er...@he...> - 2002-10-18 15:36:15
|
On Fri, Oct 18, 2002 at 10:46:57AM -0400, Nicholas Henke wrote:
> On Fri, 18 Oct 2002 10:25:27 -0400
> er...@he... wrote:
>
> > On Wed, Oct 16, 2002 at 04:06:26PM -0400, Nicholas Henke wrote:
> > >
> > > I can cause a node to oops every time with the attached script. I am
> > > running bproc-3.2.0 on 2.4.18, with the procfs locking patch.
> > > Here is the oops fed through ksymoops after rebooting.
> >
> > In general, I'd say I don't worry about co-existing nicely with other
> > patches. It's just too much trouble for me. If you point me at the
> > other patch you're using I might be able to see what the problem is
> > pretty quickly though. More likely, the locking is all moved around
> > in procfs and the BProc hooks need to be modified accordingly. The
> > BProc hooks into procfs are pretty messy since things need to be done
> > atomically and there's a bunch of lock grabbing/releasing code already
> > in there.
> >
> Sorry for the confusion. -- it is the patch that you sent me for the
> last oops I sent, where we moved the bproc_hook_imv call inside the lock
> in fs/proc/array.c ( I think that is the file.)

Oh, I see. I should extend that policy to my own patches :)

Seriously though, I took your script and reproduced it so I'm looking
at it now.

- Erik |
From: Nicholas H. <he...@se...> - 2002-10-18 14:48:13
|
On Fri, 18 Oct 2002 10:25:27 -0400 er...@he... wrote:

> On Wed, Oct 16, 2002 at 04:06:26PM -0400, Nicholas Henke wrote:
> >
> > I can cause a node to oops every time with the attached script. I am
> > running bproc-3.2.0 on 2.4.18, with the procfs locking patch.
> > Here is the oops fed through ksymoops after rebooting.
>
> In general, I'd say I don't worry about co-existing nicely with other
> patches. It's just too much trouble for me. If you point me at the
> other patch you're using I might be able to see what the problem is
> pretty quickly though. More likely, the locking is all moved around
> in procfs and the BProc hooks need to be modified accordingly. The
> BProc hooks into procfs are pretty messy since things need to be done
> atomically and there's a bunch of lock grabbing/releasing code already
> in there.
>

Sorry for the confusion. -- it is the patch that you sent me for the
last oops I sent, where we moved the bproc_hook_imv call inside the lock
in fs/proc/array.c ( I think that is the file.)

Nic

> - Erik |
From: <er...@he...> - 2002-10-18 14:45:47
|
On Fri, Oct 18, 2002 at 09:29:31AM -0400, Nicholas Henke wrote:
> First of all -- thanks for mpirun -- it works great. I have a few
> questions about it though:
>
> When running the round test program, I only see the output from the
> first node, not all of them.
>
> Is it possible to update it to mpich-1.2.4 ? create RPMS for mpich with
> mpirun ?

mpich-1.2.4: Sure but it'll take some time. I don't personally have
time to do it right now. (I presume you're talking about the P4
device.)

Ditto for RPMs. It'd be easy enough to do but it'll take a fair
amount of time.

- Erik |
From: <er...@he...> - 2002-10-18 14:43:29
|
On Wed, Oct 16, 2002 at 04:06:26PM -0400, Nicholas Henke wrote:
>
> I can cause a node to oops every time with the attached script. I am
> running bproc-3.2.0 on 2.4.18, with the procfs locking patch.
> Here is the oops fed through ksymoops after rebooting.

In general, I'd say I don't worry about co-existing nicely with other
patches. It's just too much trouble for me. If you point me at the
other patch you're using I might be able to see what the problem is
pretty quickly though. More likely, the locking is all moved around
in procfs and the BProc hooks need to be modified accordingly. The
BProc hooks into procfs are pretty messy since things need to be done
atomically and there's a bunch of lock grabbing/releasing code already
in there.

- Erik

> Unable to handle kernel NULL pointer dereference at virtual address
> 00000010
> *pde = 00000000
> OOps: ----
> CPU: 0
> EIP: 0010:[<e095357f>] Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EFLAGS: 00010202
> Process ps (pid: 788, stackpage=de02b000)
> Call Trace: [<c015c410>] [<c01343e1>] [<c015a593>] [<c013b1f6>]
> [<c010734b>]
> Code: 8b 40 10 c3 90 8b 82 94 00 00 00 8b 40 7c c3 89 f6 f6 50 20
>
> >>EIP; e095357f <[bproc]send_recv_process+ff/145> <=====
> Trace; c015c410 <proc_pid_status+110/3f0>
> Trace; c01343e1 <__alloc_pages+41/180>
> Trace; c015a593 <proc_info_read+63/120>
> Trace; c013b1f6 <sys_read+96/120>
> Trace; c010734b <system_call+33/38>
> Code; e095357f <[bproc]send_recv_process+ff/145>
> 00000000 <_EIP>:
> Code; e095357f <[bproc]send_recv_process+ff/145> <=====
> 0: 8b 40 10 mov 0x10(%eax),%eax <=====
> Code; e0953582 <[bproc]send_recv_process+102/145>
> 3: c3 ret
> Code; e0953583 <[bproc]send_recv_process+103/145>
> 4: 90 nop
> Code; e0953584 <[bproc]send_recv_process+104/145>
> 5: 8b 82 94 00 00 00 mov 0x94(%edx),%eax
> Code; e095358a <[bproc]send_recv_process+10a/145>
> b: 8b 40 7c mov 0x7c(%eax),%eax
> Code; e095358d <[bproc]send_recv_process+10d/145>
> e: c3 ret
> Code; e095358e <[bproc]send_recv_process+10e/145>
> f: 89 f6 mov %esi,%esi
> Code; e0953590 <[bproc]send_recv_process+110/145>
> 11: f6 50 20 notb 0x20(%eax)
>
>
> Nic
> --
> Nicholas Henke
> Linux Cluster Systems Programmer
> he...@se... - 215.573.8149 |
From: Nicholas H. <he...@se...> - 2002-10-18 13:30:48
|
First of all -- thanks for mpirun -- it works great. I have a few
questions about it though:

When running the round test program, I only see the output from the
first node, not all of them.

Is it possible to update it to mpich-1.2.4 ? create RPMS for mpich with
mpirun ?

Thanks!
Nic
--
Nicholas Henke
Linux Cluster Systems Programmer
he...@se... - 215.573.8149 |
From: Nicholas H. <he...@se...> - 2002-10-17 15:38:35
|
I can cause a node to oops every time with the attached script. I am
running bproc-3.2.0 on 2.4.18, with the procfs locking patch.
Here is the oops fed through ksymoops after rebooting.

Unable to handle kernel NULL pointer dereference at virtual address
00000010
*pde = 00000000
OOps: ----
CPU: 0
EIP: 0010:[<e095357f>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010202
Process ps (pid: 788, stackpage=de02b000)
Call Trace: [<c015c410>] [<c01343e1>] [<c015a593>] [<c013b1f6>]
[<c010734b>]
Code: 8b 40 10 c3 90 8b 82 94 00 00 00 8b 40 7c c3 89 f6 f6 50 20

>>EIP; e095357f <[bproc]send_recv_process+ff/145> <=====
Trace; c015c410 <proc_pid_status+110/3f0>
Trace; c01343e1 <__alloc_pages+41/180>
Trace; c015a593 <proc_info_read+63/120>
Trace; c013b1f6 <sys_read+96/120>
Trace; c010734b <system_call+33/38>
Code; e095357f <[bproc]send_recv_process+ff/145>
00000000 <_EIP>:
Code; e095357f <[bproc]send_recv_process+ff/145> <=====
0: 8b 40 10 mov 0x10(%eax),%eax <=====
Code; e0953582 <[bproc]send_recv_process+102/145>
3: c3 ret
Code; e0953583 <[bproc]send_recv_process+103/145>
4: 90 nop
Code; e0953584 <[bproc]send_recv_process+104/145>
5: 8b 82 94 00 00 00 mov 0x94(%edx),%eax
Code; e095358a <[bproc]send_recv_process+10a/145>
b: 8b 40 7c mov 0x7c(%eax),%eax
Code; e095358d <[bproc]send_recv_process+10d/145>
e: c3 ret
Code; e095358e <[bproc]send_recv_process+10e/145>
f: 89 f6 mov %esi,%esi
Code; e0953590 <[bproc]send_recv_process+110/145>
11: f6 50 20 notb 0x20(%eax)

Nic
--
Nicholas Henke
Linux Cluster Systems Programmer
he...@se... - 215.573.8149 |
From: Andrew S. <sh...@in...> - 2002-10-16 19:47:30
|
On Wed, 16 Oct 2002 13:24:33 -0600 Wilton Wong <ww...@ha...> wrote:

> Add " around 3.2.1 then gcc should be happier.. it will then interpret the
> number as a string and not as a number..

So did versions of gcc before 3.2 treat string variables differently? It
seems odd to me that it requires both single and double quotes and not just
one or the other (e.g. -DPACKAGE_VERSION="'3.2.1'"). Is it a gcc bug?

Andrew
--
Andrew Shewmaker
Associate Engineer
Phone: 208.526.1415
Fax: 208.526.4017

Idaho National Engineering and Environmental Laboratory
2525 Fremont Ave.
Idaho Falls, ID 83415-3605 |
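On the quoting question above: assuming the build runs the compiler through a POSIX shell, the shell removes the outer layer of quotes before gcc sees the argument. `-DPACKAGE_VERSION='3.2.1'` therefore defines the macro as the bare token `3.2.1`, which is not a valid number, while `-DPACKAGE_VERSION='"3.2.1"'` defines it as the string literal `"3.2.1"`. The order matters: `"'3.2.1'"` would leave single quotes for the compiler, which is a character constant, not a string. A hedged alternative that avoids command-line quoting altogether is to stringify the macro in the source; `STR`, `XSTR` and `package_version` below are illustrative names, not BProc code.

```c
/* Two-level stringification: XSTR expands its argument first, then STR
 * turns the resulting tokens into a string literal. */
#define STR(x)  #x
#define XSTR(x) STR(x)

/* Normally supplied on the command line, e.g. -DPACKAGE_VERSION=3.2.1
 * (no quotes needed with this approach). The default here is only for
 * illustration. */
#ifndef PACKAGE_VERSION
#define PACKAGE_VERSION 3.2.1
#endif

/* Return the package version as a C string. */
const char *package_version(void)
{
    return XSTR(PACKAGE_VERSION);
}
```

This works because `3.2.1` is still a single valid preprocessing token, even though it cannot be converted to a numeric constant; it only fails when the compiler proper has to parse it as a number.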
From: Wilton W. <ww...@ha...> - 2002-10-16 19:24:58
|
Add " around 3.2.1 then gcc should be happier.. it will then interpret the
number as a string and not as a number..

- Wilton

On Wed, 16 Oct 2002, Andrew Shewmaker wrote:
> On Thu, 10 Oct 2002 17:48:08 -0400
> er...@he... wrote:
>
> > On Thu, Oct 10, 2002 at 02:37:04PM -0600, Andrew Shewmaker wrote:
> > > I am attempting to use bproc-3.2.1 with Mandrake 9.0 (kernel 2.4.19-16mdk and gcc 3.2)
> > > I have successfully patched and compiled Mandrake's kernel with bproc, which meant I
> > > had to do some of it by hand since the grsecurity stuff confused patch. Then I tried to
> > > compile the bproc tools and I see the following error.
> > >
> > > ghost.c:63: initializer element is not constant
> > > ghost.c:63: (near initialization for `bproc_ghost_reqs')
> > > ghost.c:63: initializer element is not constant
> > >
> > > I assume this is simply due to gcc 3.2, and I will switch to an older compiler if I
> > > have to, but if someone could suggest a simple fix I would appreciate it.
> >
> > I have no idea how a constant can become non constant when changing
> > compilers. Anyway, I worked around it. Patch below (against bproc 3.2.1):
>
>
> Thanks for the patch (I haven't tried J.A. Magallon's). I've run into some other weird
> problems that I was able to work around. Basically, it looks like gcc 3.2 was unhappy with
> PACKAGE_VERSION having more than one decimal, and it said that there was a parse error
> caused by PACKAGE_VERSION in the call to MODULE_DESCRIPTION. It looked to me like the patch
> applied cleanly (except for unimportant revision control lines).
>
> gcc -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -I. -I../vmadump -I../clients -I/usr/src/linux-2.4.19-16mdk-bproc/include -pipe -fno-strength-reduce -D__KERNEL__ -DMODULE -DPACKAGE_VERSION='3.2.1' -DPACKAGE_MAGIC='11598' -DENABLE_DEBUG -DLINUX_TCP_IS_BROKEN -DSUPPORT_FILEREQ -c interface.c
> interface.c:52: too many decimal points in floating constant
> interface.c:52: parse error before numeric constant
> interface.c: In function `bproc_version_check':
> interface.c:1170: too many decimal points in floating constant
> interface.c:1170: warning: missing braces around initializer
> interface.c:1170: warning: (near initialization for `vers.version_string')
> interface.c: In function `init_module':
> interface.c:1396: too many decimal points in floating constant
> interface.c:1396: warning: format argument is not a pointer (arg 2)
> interface.c:1396: warning: format argument is not a pointer (arg 2)
> make[1]: *** [interface.o] Error 1
> make[1]: Leaving directory `/usr/local/src/bproc-3.2.1/kernel'
> make: *** [kernel] Error 2
>
>
> Changing PACKAGE_VERSION to 3.2
>
> gcc -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -I. -I../vmadump -I../clients -I/usr/src/linux-2.4.19-16mdk-bproc/include -pipe -fno-strength-reduce -D__KERNEL__ -DMODULE -DPACKAGE_VERSION='3.2' -DPACKAGE_MAGIC='11598' -DENABLE_DEBUG -DLINUX_TCP_IS_BROKEN -DSUPPORT_FILEREQ -c interface.c
> interface.c:52: parse error before numeric constant
> interface.c: In function `bproc_version_check':
> interface.c:1170: warning: missing braces around initializer
> interface.c:1170: warning: (near initialization for `vers.version_string')
> interface.c: In function `init_module':
> interface.c:1396: warning: format argument is not a pointer (arg 2)
> interface.c:1396: warning: format argument is not a pointer (arg 2)
> make[1]: *** [interface.o] Error 1
> make[1]: Leaving directory `/usr/local/src/bproc-3.2.1/kernel'
> make: *** [kernel] Error 2
>
>
> So I removed PACKAGE_VERSION from the call to MODULE_DESCRIPTION and interface.c compiled.
>
>
> Andrew
>
> --
> Andrew Shewmaker
> Associate Engineer
> Phone: 208.526.1415
> Fax: 208.526.4017
>
> Idaho National Engineering and Environmental Laboratory
> 2525 Fremont Ave.
> Idaho Falls, ID 83415-3605

----[ Wilton William Wong ]---------------------------------------------
11060-166 Avenue    Ph : 01-780-456-9771       High Performance UNIX
Edmonton, Alberta   FAX: 01-780-456-9772       and Linux Solutions
T5X 1Y3, Canada     URL: http://www.harddata.com
-------------------------------------------------------[ Hard Data Ltd. ]---- |
From: Andrew S. <sh...@in...> - 2002-10-16 19:21:56
|
On Thu, 10 Oct 2002 17:48:08 -0400 er...@he... wrote:

> On Thu, Oct 10, 2002 at 02:37:04PM -0600, Andrew Shewmaker wrote:
> > I am attempting to use bproc-3.2.1 with Mandrake 9.0 (kernel 2.4.19-16mdk and gcc 3.2)
> > I have successfully patched and compiled Mandrake's kernel with bproc, which meant I
> > had to do some of it by hand since the grsecurity stuff confused patch. Then I tried to
> > compile the bproc tools and I see the following error.
> >
> > ghost.c:63: initializer element is not constant
> > ghost.c:63: (near initialization for `bproc_ghost_reqs')
> > ghost.c:63: initializer element is not constant
> >
> > I assume this is simply due to gcc 3.2, and I will switch to an older compiler if I
> > have to, but if someone could suggest a simple fix I would appreciate it.
>
> I have no idea how a constant can become non constant when changing
> compilers. Anyway, I worked around it. Patch below (against bproc 3.2.1):

Thanks for the patch (I haven't tried J.A. Magallon's). I've run into some other weird
problems that I was able to work around. Basically, it looks like gcc 3.2 was unhappy with
PACKAGE_VERSION having more than one decimal, and it said that there was a parse error
caused by PACKAGE_VERSION in the call to MODULE_DESCRIPTION. It looked to me like the patch
applied cleanly (except for unimportant revision control lines).

gcc -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -I. -I../vmadump -I../clients -I/usr/src/linux-2.4.19-16mdk-bproc/include -pipe -fno-strength-reduce -D__KERNEL__ -DMODULE -DPACKAGE_VERSION='3.2.1' -DPACKAGE_MAGIC='11598' -DENABLE_DEBUG -DLINUX_TCP_IS_BROKEN -DSUPPORT_FILEREQ -c interface.c
interface.c:52: too many decimal points in floating constant
interface.c:52: parse error before numeric constant
interface.c: In function `bproc_version_check':
interface.c:1170: too many decimal points in floating constant
interface.c:1170: warning: missing braces around initializer
interface.c:1170: warning: (near initialization for `vers.version_string')
interface.c: In function `init_module':
interface.c:1396: too many decimal points in floating constant
interface.c:1396: warning: format argument is not a pointer (arg 2)
interface.c:1396: warning: format argument is not a pointer (arg 2)
make[1]: *** [interface.o] Error 1
make[1]: Leaving directory `/usr/local/src/bproc-3.2.1/kernel'
make: *** [kernel] Error 2

Changing PACKAGE_VERSION to 3.2

gcc -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -I. -I../vmadump -I../clients -I/usr/src/linux-2.4.19-16mdk-bproc/include -pipe -fno-strength-reduce -D__KERNEL__ -DMODULE -DPACKAGE_VERSION='3.2' -DPACKAGE_MAGIC='11598' -DENABLE_DEBUG -DLINUX_TCP_IS_BROKEN -DSUPPORT_FILEREQ -c interface.c
interface.c:52: parse error before numeric constant
interface.c: In function `bproc_version_check':
interface.c:1170: warning: missing braces around initializer
interface.c:1170: warning: (near initialization for `vers.version_string')
interface.c: In function `init_module':
interface.c:1396: warning: format argument is not a pointer (arg 2)
interface.c:1396: warning: format argument is not a pointer (arg 2)
make[1]: *** [interface.o] Error 1
make[1]: Leaving directory `/usr/local/src/bproc-3.2.1/kernel'
make: *** [kernel] Error 2

So I removed PACKAGE_VERSION from the call to MODULE_DESCRIPTION and interface.c compiled.

Andrew
--
Andrew Shewmaker
Associate Engineer
Phone: 208.526.1415
Fax: 208.526.4017

Idaho National Engineering and Environmental Laboratory
2525 Fremont Ave.
Idaho Falls, ID 83415-3605 |
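Regarding the `initializer element is not constant` error quoted above: in ISO C a compound literal such as `(struct foo){...}` is not a constant expression, so using one to initialize an object with static storage duration depends on a compiler extension. One way around it, sketched here with hypothetical names (`struct request_queue`, `EMPTY_QUEUE_STATIC`, `EMPTY_QUEUE`), is to keep a plain brace initializer for static objects and wrap the same initializer in a compound literal only for run-time assignment. Whether this is exactly what gcc 3.2 tripped over in ghost.c is an assumption; the thread does not confirm it.

```c
/* Hypothetical, simplified queue type; the real bproc_request_queue_t
 * also holds a spinlock, list heads and a wait queue. */
struct request_queue {
    int lock;
    int count;
};

/* Plain brace initializer: valid for objects with static storage. */
#define EMPTY_QUEUE_STATIC { 0, 0 }

/* Compound literal built from the same initializer: usable in
 * assignments at run time. */
#define EMPTY_QUEUE ((struct request_queue) EMPTY_QUEUE_STATIC)

/* Static object initialized with the brace form only. */
static struct request_queue global_queue = EMPTY_QUEUE_STATIC;

/* Reset a queue at run time using the compound-literal form. */
void reset_queue(struct request_queue *q)
{
    *q = EMPTY_QUEUE;
}

/* Number of entries in the statically initialized queue. */
int global_queue_count(void)
{
    return global_queue.count;
}
```

Keeping both macros built from one initializer means the static and run-time forms cannot drift apart.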
From: Wilton W. <ww...@ha...> - 2002-10-16 02:47:20
|
Well, it looks like a hardware bug. I am guessing the built-in Ethernet on
the NVIDIA nForce boards (the 220Ds) has problems; I tried some Intel NICs we
had kicking around and they work just fine. I am still not sure why it screws
up when transferring multicast data to more than one client, or whether the
problem is on the server side or the client side. I will test some more
tomorrow and see.

- Wilton

----[ Wilton William Wong ]---------------------------------------------
11060-166 Avenue    Ph : 01-780-456-9771       High Performance UNIX
Edmonton, Alberta   FAX: 01-780-456-9772       and Linux Solutions
T5X 1Y3, Canada     URL: http://www.harddata.com
-------------------------------------------------------[ Hard Data Ltd. ]---- |
From: Wilton W. <ww...@ha...> - 2002-10-15 23:13:15
|
That seems to "fix" the problem.. but I suppose I will have to dig a bit
deeper to find out exactly why sched_setscheduler() to priority 1 and
SCHED_FIFO would make the process unresponsive, it should do the exact
opposite.

- Wilton

On Thu, 10 Oct 2002, Erik Arjan Hendriks wrote:
> p.sched_priority = 1;
> if (sched_setscheduler(0, SCHED_FIFO, &p))
>     syslog(LOG_NOTICE, "Failed to set real-time scheduling for"
>            " slave daemon.\n");

----[ Wilton William Wong ]---------------------------------------------
11060-166 Avenue    Ph : 01-780-456-9771       High Performance UNIX
Edmonton, Alberta   FAX: 01-780-456-9772       and Linux Solutions
T5X 1Y3, Canada     URL: http://www.harddata.com
-------------------------------------------------------[ Hard Data Ltd. ]---- |
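For experimenting with the call quoted above outside the slave daemon, here is a small standalone sketch. `request_fifo` is a hypothetical helper name, and the program normally needs root (or the appropriate capability) for the call to succeed. One general property of SCHED_FIFO that may be relevant, though the thread does not establish it as the cause here, is that a runnable SCHED_FIFO task is not preempted by normal-priority tasks, so a busy loop in it can starve everything else on that CPU.

```c
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Try to switch the calling process to SCHED_FIFO at `priority`.
 * Returns 0 on success, -1 on failure with errno preserved.
 * Hypothetical helper for illustration; not BProc code. */
int request_fifo(int priority)
{
    struct sched_param p;

    memset(&p, 0, sizeof(p));
    p.sched_priority = priority;

    /* pid 0 means "the calling process". */
    if (sched_setscheduler(0, SCHED_FIFO, &p) == -1) {
        int saved = errno;   /* fprintf may clobber errno */
        fprintf(stderr, "failed to set real-time scheduling: %s\n",
                strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}
```

Calling `sched_getscheduler(0)` afterwards is a simple way to confirm which policy is actually in effect.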
From: <er...@he...> - 2002-10-15 16:20:39
|
On Sun, Oct 13, 2002 at 08:17:59PM +0200, Ana Bosque wrote:
> Hi folks,
>
> We have been working with a bproc cluster with mpich for some months.
> Now we want to put more than one mpi process in the front-end of the
> system, so we make the front-end master and slave at the same time and
> everything is ok. But when we execute a mpi program that need to put a
> mpi process in the front-end (+ the process that it's been executed
> locally in the front-end) it just hangs in the MPI_Init instruction.
>
> Has anyone any idea?

First a few questions:

What version of BProc are you running?
Which MPICH modification?

It sounds like you might be using the old Scyld MPI hack since none of
the other hacks that I know of place any processes on the front end.
That's a very unusual thing to want to do - especially in clusters of
any size.

- Erik |
From: <er...@he...> - 2002-10-15 15:55:47
|
On Fri, Oct 11, 2002 at 12:51:19AM -0600, Wilton Wong wrote:
> Are the 50+ machines that are booting coming up at the same time ? or are they
> different speeds ? we are using NForce (yucky) board that come up pretty quick
> using PXE, and they come up almost simultaneously.. all of the nodes (well my
> current 3 test nodes) are on the same subnet.. same switch... the only real
> difference from a default install is that we are starting node numbering from
> node 1 rather than 0 ( node 1 = 192.168.21.1.. etc..) and the master node sits
> on the end of the subnet 192.168.21.254

Yeah, basically simultaneously. We're using myrinet and they all get
mapped at about the same time. When I say 50+ at a time, that's the
number that the node_up program says it's going at once. e.g.

"Sep 17 13:57:47 xed beoserv: Starting node_up worker for 63 clients. "

- Erik

> On Thu, 10 Oct 2002, Erik Arjan Hendriks wrote:
>
> > It works fine for me. I've seen node_up do 50+ nodes at a time on our
> > cluster here. The only hidden gotcha that I can think of with vrfork
> > is that the nodes need to be able to reach one another w/ IP. So, if
> > they're on different subnets, etc. you're going to have some trouble.
> > The default route that the boot code puts in is bogus.
>
> ----[ Wilton William Wong ]---------------------------------------------
> 11060-166 Avenue    Ph : 01-780-456-9771       High Performance UNIX
> Edmonton, Alberta   FAX: 01-780-456-9772       and Linux Solutions
> T5X 1Y3, Canada     URL: http://www.harddata.com
> -------------------------------------------------------[ Hard Data Ltd. ]---- |
From: Ana B. <an...@iv...> - 2002-10-13 18:18:22
|
Hi folks,

We have been working with a BProc cluster with MPICH for some months. Now we want to run more than one MPI process on the front end of the system, so we made the front end act as master and slave at the same time, and that setup works. But when we execute an MPI program that needs to place an MPI process on the front end (in addition to the process that is already running locally on the front end), it just hangs in the MPI_Init call.

Does anyone have any idea?

ANA

P.S. We use MPICH 1.2.4 and we don't use the mpirun from the tarball. |
From: Wilton W. <ww...@ha...> - 2002-10-11 06:51:47
|
Are the 50+ machines that are booting coming up at the same time, or are they different speeds? We are using NForce (yucky) boards that come up pretty quickly using PXE, and they come up almost simultaneously. All of the nodes (well, my current 3 test nodes) are on the same subnet, on the same switch. The only real difference from a default install is that we are starting node numbering from node 1 rather than 0 (node 1 = 192.168.21.1, etc.) and the master node sits at the end of the subnet, at 192.168.21.254.

- Wilton

On Thu, 10 Oct 2002, Erik Arjan Hendriks wrote:
> It works fine for me. I've seen node_up do 50+ nodes at a time on our
> cluster here. The only hidden gotcha that I can think of with vrfork
> is that the nodes need to be able to reach one another w/ IP. So, if
> they're on different subnets, etc. you're going to have some trouble.
> The default route that the boot code puts in is bogus.

----[ Wilton William Wong ]---------------------------------------------
11060-166 Avenue        Ph : 01-780-456-9771    High Performance UNIX
Edmonton, Alberta       FAX: 01-780-456-9772    and Linux Solutions
T5X 1Y3, Canada         URL: http://www.harddata.com
-------------------------------------------------------[ Hard Data Ltd. ]---- |
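For anyone checking the "same subnet" condition mentioned in the quoted reply, here is a minimal sketch in C. It only compares addresses under a netmask; it does not test actual reachability or routing. The helper name same_subnet and the 255.255.255.0 netmask used in the usage note are assumptions for illustration, not something stated in this thread.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: returns 1 if a and b fall in the same subnet
 * under mask, 0 if they do not, and -1 if any argument is not a valid
 * dotted-quad IPv4 address. */
static int same_subnet(const char *a, const char *b, const char *mask)
{
    struct in_addr ia, ib, im;

    if (inet_pton(AF_INET, a, &ia) != 1 ||
        inet_pton(AF_INET, b, &ib) != 1 ||
        inet_pton(AF_INET, mask, &im) != 1)
        return -1;

    /* All three values are in network byte order, so masking and
     * comparing them directly is safe. */
    return (ia.s_addr & im.s_addr) == (ib.s_addr & im.s_addr);
}
```

With an assumed /24 netmask, same_subnet("192.168.21.1", "192.168.21.254", "255.255.255.0") reports that node 1 and the master share a subnet.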
From: J.A. M. <jam...@ab...> - 2002-10-10 22:33:23
|
On 2002.10.10 er...@he... wrote: >On Thu, Oct 10, 2002 at 02:37:04PM -0600, Andrew Shewmaker` wrote: >> I am attempting to use bproc-3.2.1 with Mandrake 9.0 (kernel 2.4.19-16mdk and gcc 3.2) >> I have successfully patched and compiled Mandrake's kernel with bproc, which meant I >> had to do some of it by hand since the grsecurity stuff confused patch. Then I tried to >> compile the bproc tools and I see the following error. >> >> ghost.c:63: initializer element is not constant >> ghost.c:63: (near initialization for `bproc_ghost_reqs') >> ghost.c:63: initializer element is not constant >> >> I assume this is simply due to gcc 3.2, and I will switch to an older compiler if I >> have to, but if someone could suggest a simple fix I would appreciate it. > >I have no idea how a constant can become non constant when changing >compilers. Anyway, I worked around it. Patch below (against bproc 3.2.1): > Much simpler solution (at least for that problem, do not know anything about constant strings and newlines): diff -ruN bproc-3.1.9/kernel/bproc.h bproc-3.1.9-j/kernel/bproc.h --- bproc/kernel/bproc.h 2002-02-19 23:25:47.000000000 +0100 +++ bproc-j/kernel/bproc.h 2002-03-29 11:52:43.000000000 +0100 @@ -582,10 +582,12 @@ #define BPROC_DEADREQ(r) ((r)->req.req == 0) #define BPROC_PENDING(r) ((!BPROC_DEADREQ(r))&&(!BPROC_ISRESPONSE((r)->req.req))) -#define EMPTY_BPROC_REQUEST_QUEUE(foo) \ - ((struct bproc_request_queue_t) {SPIN_LOCK_UNLOCKED,0, \ +#define EMPTY_BPROC_REQUEST_QUEUE_STATIC(foo) \ + {SPIN_LOCK_UNLOCKED,0, \ LIST_HEAD_INIT((foo).list),__WAIT_QUEUE_HEAD_INITIALIZER((foo).wait),\ - LIST_HEAD_INIT((foo).pending)}) + LIST_HEAD_INIT((foo).pending)} +#define EMPTY_BPROC_REQUEST_QUEUE(foo) \ + ((struct bproc_request_queue_t) EMPTY_BPROC_REQUEST_QUEUE_STATIC(foo)) extern atomic_t msg_count; static inline diff -ruN bproc-3.1.9/kernel/ghost.c bproc-3.1.9-j/kernel/ghost.c --- bproc/kernel/ghost.c 2002-03-08 20:26:31.000000000 +0100 +++ bproc-j/kernel/ghost.c 2002-03-29 
11:52:59.000000000 +0100 @@ -60,7 +60,7 @@ DECLARE_WAIT_QUEUE_HEAD(ghost_wait); struct bproc_request_queue_t bproc_ghost_reqs = - EMPTY_BPROC_REQUEST_QUEUE(bproc_ghost_reqs); + EMPTY_BPROC_REQUEST_QUEUE_STATIC(bproc_ghost_reqs); int ghost_deliver_msg(pid_t pid, struct bproc_krequest_t *req) { -- J.A. Magallon <jam...@ab...> \ Software is like sex: werewolf.able.es \ It's better when it's free Mandrake Linux release 9.1 (Cooker) for i586 Linux 2.4.20-pre10-jam1 (gcc 3.2 (Mandrake Linux 9.0 3.2-2mdk)) |
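The idea behind the patch above can be shown outside the kernel with a small C sketch. A compound literal such as `(struct T){...}` is an expression, and strict ISO C does not accept it as the initializer of an object with static storage, which is presumably what gcc 3.2 is complaining about; a bare brace list is fine there. The struct and macro names below are hypothetical stand-ins, not the real BProc definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct bproc_request_queue_t; the fields
 * are illustrative only. */
struct queue {
    int lock;
    int closing;
    struct queue *next;
    struct queue *prev;
};

/* Bare brace list: usable as the initializer of a file-scope object. */
#define EMPTY_QUEUE_STATIC(q) { 0, 0, &(q), &(q) }

/* Compound literal built from the same list: usable in a run-time
 * assignment, but not portable as a static initializer. */
#define EMPTY_QUEUE(q) ((struct queue) EMPTY_QUEUE_STATIC(q))

/* File-scope object: initialized with the brace form. */
static struct queue global_queue = EMPTY_QUEUE_STATIC(global_queue);

/* Run-time reset: assigned from the compound-literal form. */
static void queue_reset(struct queue *q)
{
    *q = EMPTY_QUEUE(*q);
}

/* An empty queue points back at itself in both directions. */
static int queue_is_empty(const struct queue *q)
{
    return q->next == q && q->prev == q;
}
```

Splitting the macro this way keeps a single definition of the field values while giving each context the form it can accept.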
From: <er...@he...> - 2002-10-10 22:04:35
|
On Thu, Oct 10, 2002 at 02:37:04PM -0600, Andrew Shewmaker` wrote: > I am attempting to use bproc-3.2.1 with Mandrake 9.0 (kernel 2.4.19-16mdk and gcc 3.2) > I have successfully patched and compiled Mandrake's kernel with bproc, which meant I > had to do some of it by hand since the grsecurity stuff confused patch. Then I tried to > compile the bproc tools and I see the following error. > > ghost.c:63: initializer element is not constant > ghost.c:63: (near initialization for `bproc_ghost_reqs') > ghost.c:63: initializer element is not constant > > I assume this is simply due to gcc 3.2, and I will switch to an older compiler if I > have to, but if someone could suggest a simple fix I would appreciate it. I have no idea how a constant can become non constant when changing compilers. Anyway, I worked around it. Patch below (against bproc 3.2.1): - Erik Index: bproc/ChangeLog diff -c bproc/ChangeLog:1.83 bproc/ChangeLog:1.84 *** bproc/ChangeLog:1.83 Mon Sep 30 12:42:31 2002 --- bproc/ChangeLog Thu Oct 3 11:32:13 2002 *************** *** 1,4 **** ! Changes from 3.2.0 to 3.2.1 * Fixed an oddity in bproc_proclist and bproc_nodelist where it would still allocate memory even if the return value is zero. --- 1,8 ---- ! Changes from 3.2.1 to 3.2.2 ! ! * Fixed build problems and warnings with gcc 3.2. ! ! Changes from 3.2.0 to 3.2.1 * Fixed an oddity in bproc_proclist and bproc_nodelist where it would still allocate memory even if the return value is zero. Index: bproc/clients/bsh.c diff -c bproc/clients/bsh.c:1.11 bproc/clients/bsh.c:1.12 *** bproc/clients/bsh.c:1.11 Wed Aug 29 00:55:36 2001 --- bproc/clients/bsh.c Thu Oct 3 11:32:13 2002 *************** *** 17,23 **** * along with this program; if not, write to the Free Software * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. * ! 
* $Id: bsh.c,v 1.11 2001/08/29 04:55:36 hendriks Exp $ *-----------------------------------------------------------------------*/ #include <stdio.h> #include <stdlib.h> --- 17,23 ---- * along with this program; if not, write to the Free Software * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. * ! * $Id: bsh.c,v 1.12 2002/10/03 15:32:13 hendriks Exp $ *-----------------------------------------------------------------------*/ #include <stdio.h> #include <stdlib.h> *************** *** 59,65 **** fprintf(stderr, "No ghost master present.\n"); break; default: ! fprintf(stderr, "%s\n", sys_errlist[errno]); } exit(1); --- 59,65 ---- fprintf(stderr, "No ghost master present.\n"); break; default: ! fprintf(stderr, "%s\n", strerror(errno)); } exit(1); Index: bproc/daemons/iod.c diff -c bproc/daemons/iod.c:1.18 bproc/daemons/iod.c:1.19 *** bproc/daemons/iod.c:1.18 Sun Oct 14 02:47:11 2001 --- bproc/daemons/iod.c Thu Oct 3 11:32:13 2002 *************** *** 17,23 **** * along with this program; if not, write to the Free Software * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. * ! * $Id: iod.c,v 1.18 2001/10/14 06:47:11 hendriks Exp $ *-----------------------------------------------------------------------*/ #include <stdio.h> #include <stdlib.h> --- 17,23 ---- * along with this program; if not, write to the Free Software * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. * ! * $Id: iod.c,v 1.19 2002/10/03 15:32:13 hendriks Exp $ *-----------------------------------------------------------------------*/ #include <stdio.h> #include <stdlib.h> *************** *** 264,270 **** r = select(maxfd+1, &rset, &wset, 0, 0); if (r == -1) { if (errno == EINTR) continue; ! syslog(LOG_ERR, "iod: select: %s\n", sys_errlist[errno]); exit(1); } if (r > 0) { --- 264,270 ---- r = select(maxfd+1, &rset, &wset, 0, 0); if (r == -1) { if (errno == EINTR) continue; ! 
syslog(LOG_ERR, "iod: select: %s\n", strerror(errno)); exit(1); } if (r > 0) { *************** *** 302,308 **** pid = fork(); if (pid == -1) { ! syslog(LOG_ERR, "Failed to start IO daemon: %s\n", sys_errlist[errno]); exit(1); } if (pid == 0) { --- 302,308 ---- pid = fork(); if (pid == -1) { ! syslog(LOG_ERR, "Failed to start IO daemon: %s\n", strerror(errno)); exit(1); } if (pid == 0) { Index: bproc/daemons/master.c diff -c bproc/daemons/master.c:1.121 bproc/daemons/master.c:1.122 *** bproc/daemons/master.c:1.121 Mon Aug 5 18:56:09 2002 --- bproc/daemons/master.c Thu Oct 3 11:32:13 2002 *************** *** 17,23 **** * along with this program; if not, write to the Free Software * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. * ! * $Id: master.c,v 1.121 2002/08/05 22:56:09 hendriks Exp $ *-----------------------------------------------------------------------*/ #include <sys/types.h> #include <sys/stat.h> --- 17,23 ---- * along with this program; if not, write to the Free Software * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. * ! * $Id: master.c,v 1.122 2002/10/03 15:32:13 hendriks Exp $ *-----------------------------------------------------------------------*/ #include <sys/types.h> #include <sys/stat.h> *************** *** 901,907 **** pid = fork(); if (pid == -1) { syslog(LOG_ERR, "failed to run setup script for node %d\nfork: %s\n", ! s->rank, sys_errlist[errno]); return; } if (pid == 0) { --- 901,907 ---- pid = fork(); if (pid == -1) { syslog(LOG_ERR, "failed to run setup script for node %d\nfork: %s\n", ! s->rank, strerror(errno)); return; } if (pid == 0) { *************** *** 1246,1252 **** void set_keep_alive(int fd) { int flag = 1; if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &flag, sizeof(flag)) == -1) { ! syslog(LOG_ERR, "setsockopt: %s", sys_errlist[errno]); } } --- 1246,1252 ---- void set_keep_alive(int fd) { int flag = 1; if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &flag, sizeof(flag)) == -1) { ! 
syslog(LOG_ERR, "setsockopt: %s", strerror(errno)); } } *************** *** 1255,1261 **** void set_no_delay(int fd) { int flag = 1; if (setsockopt(fd, SOL_TCP, TCP_NODELAY, &flag, sizeof(flag)) == -1) { ! syslog(LOG_ERR, "setsockopt: %s", sys_errlist[errno]); } } --- 1255,1261 ---- void set_no_delay(int fd) { int flag = 1; if (setsockopt(fd, SOL_TCP, TCP_NODELAY, &flag, sizeof(flag)) == -1) { ! syslog(LOG_ERR, "setsockopt: %s", strerror(errno)); } } *************** *** 1349,1355 **** int remsize = sizeof(remote); slavefd = accept(ifc->fd, (struct sockaddr *) &remote, &remsize); if (slavefd == -1) { ! syslog(LOG_ERR, "accept: %s", sys_errlist[errno]); return -1; } --- 1349,1355 ---- int remsize = sizeof(remote); slavefd = accept(ifc->fd, (struct sockaddr *) &remote, &remsize); if (slavefd == -1) { ! syslog(LOG_ERR, "accept: %s", strerror(errno)); return -1; } *************** *** 1800,1806 **** if (r == -1) { syslog(LOG_CRIT, "write(ghost): error %s; req=%d", ! sys_errlist[errno], req->req.req); } else { if (r != sizeof(req->req)) syslog(LOG_CRIT,"write(ghost): short write; ignoring (Aaaieee!!)"); --- 1800,1806 ---- if (r == -1) { syslog(LOG_CRIT, "write(ghost): error %s; req=%d", ! strerror(errno), req->req.req); } else { if (r != sizeof(req->req)) syslog(LOG_CRIT,"write(ghost): short write; ignoring (Aaaieee!!)"); *************** *** 2058,2064 **** if (!ignore_version) return -1; } if ((ghostfd = syscall(__NR_bproc, BPROC_SYS_MASTER)) == -1) { ! syslog(LOG_ERR, "BPROC_SYS_MASTER: %s", sys_errlist[errno]); return -1; } set_non_block(ghostfd); --- 2058,2064 ---- if (!ignore_version) return -1; } if ((ghostfd = syscall(__NR_bproc, BPROC_SYS_MASTER)) == -1) { ! syslog(LOG_ERR, "BPROC_SYS_MASTER: %s", strerror(errno)); return -1; } set_non_block(ghostfd); *************** *** 2251,2257 **** r = select(maxfd+1, &rset, &wset, 0, &timeleft); if (r == -1) { if (errno == EINTR) continue; ! 
syslog(LOG_ERR, "select: %s", sys_errlist[errno]); exit(1); } /* Block the update signals while doing work. */ --- 2251,2257 ---- r = select(maxfd+1, &rset, &wset, 0, &timeleft); if (r == -1) { if (errno == EINTR) continue; ! syslog(LOG_ERR, "select: %s", strerror(errno)); exit(1); } /* Block the update signals while doing work. */ Index: bproc/kernel/bproc.h diff -c bproc/kernel/bproc.h:1.84 bproc/kernel/bproc.h:1.85 *** bproc/kernel/bproc.h:1.84 Mon Aug 5 18:56:10 2002 --- bproc/kernel/bproc.h Thu Oct 3 11:32:13 2002 *************** *** 16,22 **** * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * ! * $Id: bproc.h,v 1.84 2002/08/05 22:56:10 hendriks Exp $ *-----------------------------------------------------------------------*/ #ifndef _BPROC_H #define _BPROC_H --- 16,22 ---- * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * ! 
* $Id: bproc.h,v 1.85 2002/10/03 15:32:13 hendriks Exp $ *-----------------------------------------------------------------------*/ #ifndef _BPROC_H #define _BPROC_H *************** *** 654,659 **** --- 654,660 ---- ** Functions for sending/receiving requests **----------------------------------------------------------------------*/ /* kernel/msg.c */ + extern void bproc_init_request_queue (struct bproc_request_queue_t *q); extern void bproc_close_request_queue(struct bproc_request_queue_t *q); extern int bproc_deliver_response (struct bproc_request_queue_t *pending, struct bproc_krequest_t *req); extern struct bproc_krequest_t *bproc_next_req (struct bproc_request_queue_t *me); Index: bproc/kernel/ghost.c diff -c bproc/kernel/ghost.c:1.100 bproc/kernel/ghost.c:1.101 *** bproc/kernel/ghost.c:1.100 Tue Aug 6 11:42:57 2002 --- bproc/kernel/ghost.c Thu Oct 3 11:32:14 2002 *************** *** 17,23 **** * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * ! * $Id: ghost.c,v 1.100 2002/08/06 15:42:57 hendriks Exp $ *-----------------------------------------------------------------------*/ #define __NO_VERSION__ --- 17,23 ---- * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * ! * $Id: ghost.c,v 1.101 2002/10/03 15:32:14 hendriks Exp $ *-----------------------------------------------------------------------*/ #define __NO_VERSION__ *************** *** 59,66 **** LIST_HEAD(ghost_list); DECLARE_WAIT_QUEUE_HEAD(ghost_wait); ! struct bproc_request_queue_t bproc_ghost_reqs = ! EMPTY_BPROC_REQUEST_QUEUE(bproc_ghost_reqs); int ghost_deliver_msg(pid_t pid, struct bproc_krequest_t *req) { --- 59,65 ---- LIST_HEAD(ghost_list); DECLARE_WAIT_QUEUE_HEAD(ghost_wait); ! 
struct bproc_request_queue_t bproc_ghost_reqs; int ghost_deliver_msg(pid_t pid, struct bproc_krequest_t *req) { *************** *** 94,100 **** ghost->count = (atomic_t) ATOMIC_INIT(1); ghost->pid = current->pid; ghost->sigbypass = 0; ! ghost->req = EMPTY_BPROC_REQUEST_QUEUE(ghost->req); ghost->last_response = 0; ghost->wait=(wait_queue_head_t)__WAIT_QUEUE_HEAD_INITIALIZER(ghost->wait); ghost->state = TASK_RUNNING; --- 93,99 ---- ghost->count = (atomic_t) ATOMIC_INIT(1); ghost->pid = current->pid; ghost->sigbypass = 0; ! bproc_init_request_queue(&ghost->req); ghost->last_response = 0; ghost->wait=(wait_queue_head_t)__WAIT_QUEUE_HEAD_INITIALIZER(ghost->wait); ghost->state = TASK_RUNNING; Index: bproc/kernel/interface.c diff -c bproc/kernel/interface.c:1.80 bproc/kernel/interface.c:1.81 *** bproc/kernel/interface.c:1.80 Thu Sep 26 10:58:28 2002 --- bproc/kernel/interface.c Thu Oct 3 11:32:14 2002 *************** *** 16,22 **** * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * ! * $Id: interface.c,v 1.80 2002/09/26 14:58:28 hendriks Exp $ *-----------------------------------------------------------------------*/ --- 16,22 ---- * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * ! * $Id: interface.c,v 1.81 2002/10/03 15:32:14 hendriks Exp $ *-----------------------------------------------------------------------*/ *************** *** 1393,1398 **** --- 1393,1399 ---- printk(KERN_INFO "bproc: Beowulf Distributed Process Space Version %s\n" KERN_INFO "bproc: (C) 1999-2002 Erik Hendriks <er...@he...>\n" , PACKAGE_VERSION); + bproc_init_request_queue(&bproc_ghost_reqs); bproc_close_request_queue(&bproc_ghost_reqs); /* Until we get a master. */ /* Don't automagically MOD_INC_USE_COUNT for me, please. 
*/ Index: bproc/kernel/msg.c diff -c bproc/kernel/msg.c:1.20 bproc/kernel/msg.c:1.21 *** bproc/kernel/msg.c:1.20 Mon Aug 5 12:37:01 2002 --- bproc/kernel/msg.c Thu Oct 3 11:32:14 2002 *************** *** 17,23 **** * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * ! * $Id: msg.c,v 1.20 2002/08/05 16:37:01 hendriks Exp $ *-----------------------------------------------------------------------*/ #define __NO_VERSION__ --- 17,23 ---- * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * ! * $Id: msg.c,v 1.21 2002/10/03 15:32:14 hendriks Exp $ *-----------------------------------------------------------------------*/ #define __NO_VERSION__ *************** *** 307,312 **** --- 307,325 ---- err = bproc_send_req(reqdest, req); if (err) return err; return bproc_response_wait(req, MAX_SCHEDULE_TIMEOUT, 0); + } + + #define EMPTY_BPROC_REQUEST_QUEUE(foo) \ + ((struct bproc_request_queue_t) {SPIN_LOCK_UNLOCKED,0, \ + LIST_HEAD_INIT((foo).list),__WAIT_QUEUE_HEAD_INITIALIZER((foo).wait),\ + LIST_HEAD_INIT((foo).pending)}) + + void bproc_init_request_queue(struct bproc_request_queue_t *q) { + spin_lock_init(&q->lock); + q->closing = 0; + INIT_LIST_HEAD(&q->list); + init_waitqueue_head(&q->wait); + INIT_LIST_HEAD(&q->pending); } void bproc_close_request_queue(struct bproc_request_queue_t *q) { Index: vmadump/vmadump.c diff -c vmadump/vmadump.c:1.63 vmadump/vmadump.c:1.64 *** vmadump/vmadump.c:1.63 Thu Sep 26 10:58:28 2002 --- vmadump/vmadump.c Thu Oct 3 11:32:14 2002 *************** *** 17,23 **** * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * ! 
* $Id: vmadump.c,v 1.63 2002/09/26 14:58:28 hendriks Exp $ *-----------------------------------------------------------------------*/ #define EXPORT_SYMTAB --- 17,23 ---- * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * ! * $Id: vmadump.c,v 1.64 2002/10/03 15:32:14 hendriks Exp $ *-----------------------------------------------------------------------*/ #define EXPORT_SYMTAB *************** *** 1480,1595 **** #define SWITCH_STACK_SIZE "320" int sys_vmadump(void); /* do_switch_stack/undo_switch_stack stolen from arch/alpha/kernel/entry.S */ ! asm(" ! .align 3 ! .ent save_switch_stack ! save_switch_stack: ! lda $30,-"SWITCH_STACK_SIZE"($30) ! stq $9,0($30) ! stq $10,8($30) ! stq $11,16($30) ! stq $12,24($30) ! stq $13,32($30) ! stq $14,40($30) ! stq $15,48($30) ! stq $26,56($30) ! stt $f0,64($30) ! stt $f1,72($30) ! stt $f2,80($30) ! stt $f3,88($30) ! stt $f4,96($30) ! stt $f5,104($30) ! stt $f6,112($30) ! stt $f7,120($30) ! stt $f8,128($30) ! stt $f9,136($30) ! stt $f10,144($30) ! stt $f11,152($30) ! stt $f12,160($30) ! stt $f13,168($30) ! stt $f14,176($30) ! stt $f15,184($30) ! stt $f16,192($30) ! stt $f17,200($30) ! stt $f18,208($30) ! stt $f19,216($30) ! stt $f20,224($30) ! stt $f21,232($30) ! stt $f22,240($30) ! stt $f23,248($30) ! stt $f24,256($30) ! stt $f25,264($30) ! stt $f26,272($30) ! stt $f27,280($30) ! mf_fpcr $f0 ! stt $f28,288($30) ! stt $f29,296($30) ! stt $f30,304($30) ! stt $f0,312($30) ! ldt $f0,64($30) ! ret $31,($1),1 ! .end save_switch_stack ! ! .align 3 ! .ent restore_switch_stack ! restore_switch_stack: ! ldq $9,0($30) ! ldq $10,8($30) ! ldq $11,16($30) ! ldq $12,24($30) ! ldq $13,32($30) ! ldq $14,40($30) ! ldq $15,48($30) ! ldq $26,56($30) ! ldt $f30,312($30) # get saved fpcr ! ldt $f0,64($30) ! ldt $f1,72($30) ! ldt $f2,80($30) ! ldt $f3,88($30) ! mt_fpcr $f30 # install saved fpcr ! ldt $f4,96($30) ! ldt $f5,104($30) ! ldt $f6,112($30) ! 
ldt $f7,120($30) ! ldt $f8,128($30) ! ldt $f9,136($30) ! ldt $f10,144($30) ! ldt $f11,152($30) ! ldt $f12,160($30) ! ldt $f13,168($30) ! ldt $f14,176($30) ! ldt $f15,184($30) ! ldt $f16,192($30) ! ldt $f17,200($30) ! ldt $f18,208($30) ! ldt $f19,216($30) ! ldt $f20,224($30) ! ldt $f21,232($30) ! ldt $f22,240($30) ! ldt $f23,248($30) ! ldt $f24,256($30) ! ldt $f25,264($30) ! ldt $f26,272($30) ! ldt $f27,280($30) ! ldt $f28,288($30) ! ldt $f29,296($30) ! ldt $f30,304($30) ! lda $30,"SWITCH_STACK_SIZE"($30) ! ret $31,($1),1 ! .end restore_switch_stack ! ! .align 3 ! .globl sys_vmadump ! .ent sys_vmadump ! sys_vmadump: ! ldgp $29,0($27) ! bsr $1,save_switch_stack ! lda $16,"SWITCH_STACK_SIZE"($30) ! jsr $26,do_vmadump ! bsr $1,restore_switch_stack ! ret $31,($26),1 ! .end sys_vmadump ! "); #endif #ifdef __sparc__ --- 1480,1595 ---- #define SWITCH_STACK_SIZE "320" int sys_vmadump(void); /* do_switch_stack/undo_switch_stack stolen from arch/alpha/kernel/entry.S */ ! asm( ! ".align 3 \n" ! ".ent save_switch_stack \n" ! "save_switch_stack: \n" ! " lda $30,-"SWITCH_STACK_SIZE"($30) \n" ! " stq $9,0($30) \n" ! " stq $10,8($30) \n" ! " stq $11,16($30) \n" ! " stq $12,24($30) \n" ! " stq $13,32($30) \n" ! " stq $14,40($30) \n" ! " stq $15,48($30) \n" ! " stq $26,56($30) \n" ! " stt $f0,64($30) \n" ! " stt $f1,72($30) \n" ! " stt $f2,80($30) \n" ! " stt $f3,88($30) \n" ! " stt $f4,96($30) \n" ! " stt $f5,104($30) \n" ! " stt $f6,112($30) \n" ! " stt $f7,120($30) \n" ! " stt $f8,128($30) \n" ! " stt $f9,136($30) \n" ! " stt $f10,144($30) \n" ! " stt $f11,152($30) \n" ! " stt $f12,160($30) \n" ! " stt $f13,168($30) \n" ! " stt $f14,176($30) \n" ! " stt $f15,184($30) \n" ! " stt $f16,192($30) \n" ! " stt $f17,200($30) \n" ! " stt $f18,208($30) \n" ! " stt $f19,216($30) \n" ! " stt $f20,224($30) \n" ! " stt $f21,232($30) \n" ! " stt $f22,240($30) \n" ! " stt $f23,248($30) \n" ! " stt $f24,256($30) \n" ! " stt $f25,264($30) \n" ! " stt $f26,272($30) \n" ! " stt $f27,280($30) \n" ! 
" mf_fpcr $f0 \n" ! " stt $f28,288($30) \n" ! " stt $f29,296($30) \n" ! " stt $f30,304($30) \n" ! " stt $f0,312($30) \n" ! " ldt $f0,64($30) \n" ! " ret $31,($1),1 \n" ! ".end save_switch_stack \n" ! "\n" ! ".align 3 \n" ! ".ent restore_switch_stack \n" ! "restore_switch_stack: \n" ! " ldq $9,0($30) \n" ! " ldq $10,8($30) \n" ! " ldq $11,16($30) \n" ! " ldq $12,24($30) \n" ! " ldq $13,32($30) \n" ! " ldq $14,40($30) \n" ! " ldq $15,48($30) \n" ! " ldq $26,56($30) \n" ! " ldt $f30,312($30) # get saved fpcr \n" ! " ldt $f0,64($30) \n" ! " ldt $f1,72($30) \n" ! " ldt $f2,80($30) \n" ! " ldt $f3,88($30) \n" ! " mt_fpcr $f30 # install saved fpcr \n" ! " ldt $f4,96($30) \n" ! " ldt $f5,104($30) \n" ! " ldt $f6,112($30) \n" ! " ldt $f7,120($30) \n" ! " ldt $f8,128($30) \n" ! " ldt $f9,136($30) \n" ! " ldt $f10,144($30) \n" ! " ldt $f11,152($30) \n" ! " ldt $f12,160($30) \n" ! " ldt $f13,168($30) \n" ! " ldt $f14,176($30) \n" ! " ldt $f15,184($30) \n" ! " ldt $f16,192($30) \n" ! " ldt $f17,200($30) \n" ! " ldt $f18,208($30) \n" ! " ldt $f19,216($30) \n" ! " ldt $f20,224($30) \n" ! " ldt $f21,232($30) \n" ! " ldt $f22,240($30) \n" ! " ldt $f23,248($30) \n" ! " ldt $f24,256($30) \n" ! " ldt $f25,264($30) \n" ! " ldt $f26,272($30) \n" ! " ldt $f27,280($30) \n" ! " ldt $f28,288($30) \n" ! " ldt $f29,296($30) \n" ! " ldt $f30,304($30) \n" ! " lda $30,"SWITCH_STACK_SIZE"($30) \n" ! " ret $31,($1),1 \n" ! ".end restore_switch_stack \n" ! " \n" ! ".align 3 \n" ! ".globl sys_vmadump \n" ! ".ent sys_vmadump \n" ! "sys_vmadump: \n" ! " ldgp $29,0($27) \n" ! " bsr $1,save_switch_stack \n" ! " lda $16,"SWITCH_STACK_SIZE"($30) \n" ! " jsr $26,do_vmadump \n" ! " bsr $1,restore_switch_stack \n" ! " ret $31,($26),1 \n" ! ".end sys_vmadump \n" ! ); #endif #ifdef __sparc__ *************** *** 1597,1609 **** #define STRING_1(x) #x #define STRING(x) STRING_1(x) #define LABEL(x) STRING(C_LABEL(x)) ! asm(" .align 4 ! .globl "LABEL(sys_vmadump)" ! "LABEL(sys_vmadump)": ! mov %o7, %l5 ! 
add %sp, 0x40, %o0 ! call "LABEL(do_vmadump)" ! mov %l5, %o7"); #endif #ifdef powerpc --- 1597,1610 ---- #define STRING_1(x) #x #define STRING(x) STRING_1(x) #define LABEL(x) STRING(C_LABEL(x)) ! asm( ! " .align 4 \n" ! " .globl "LABEL(sys_vmadump)" \n" ! LABEL(sys_vmadump)": \n" ! " mov %o7, %l5 \n" ! " add %sp, 0x40, %o0 \n" ! " call "LABEL(do_vmadump)" \n" ! " mov %l5, %o7"); #endif #ifdef powerpc *************** *** 1667,1673 **** static void * old_sys_call; int init_module(void) { printk(KERN_INFO "vmadump: %s Erik Hendriks " ! "<er...@he...>\n", get_rev("$Revision: 1.63 $")); init_rwsem(&hook_lock); --- 1668,1674 ---- static void * old_sys_call; int init_module(void) { printk(KERN_INFO "vmadump: %s Erik Hendriks " ! "<er...@he...>\n", get_rev("$Revision: 1.64 $")); init_rwsem(&hook_lock); |
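Most of the userspace hunks in the patch above replace sys_errlist[errno] with strerror(errno). A minimal sketch of that pattern in isolation follows; describe_error is a hypothetical helper name used only for this illustration and does not appear in BProc.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats "<what>: <error text>" into buf using
 * strerror(), the standard replacement for indexing sys_errlist[]
 * directly. Returns whatever snprintf() returns. */
static int describe_error(char *buf, size_t len, const char *what, int err)
{
    return snprintf(buf, len, "%s: %s", what, strerror(err));
}
```

A caller would save errno right after the failing call and pass it in, for example describe_error(buf, sizeof buf, "select", errno), then hand buf to syslog() or fprintf().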
From: Andrew S. <sh...@in...> - 2002-10-10 20:37:28
|
I am attempting to use bproc-3.2.1 with Mandrake 9.0 (kernel 2.4.19-16mdk and gcc 3.2). I have successfully patched and compiled Mandrake's kernel with bproc, which meant I had to do some of it by hand since the grsecurity stuff confused patch. Then I tried to compile the bproc tools and I see the following error.

  ghost.c:63: initializer element is not constant
  ghost.c:63: (near initialization for `bproc_ghost_reqs')
  ghost.c:63: initializer element is not constant

I assume this is simply due to gcc 3.2, and I will switch to an older compiler if I have to, but if someone could suggest a simple fix I would appreciate it.

Thanks,
-Andrew

--
Andrew Shewmaker
Associate Engineer
Phone: 208.526.1415
Fax: 208.526.4017
Idaho National Engineering and Environmental Laboratory
2525 Fremont Ave.
Idaho Falls, ID 83415-3605 |