| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2001 | | | | | | | | | | 25 | | 22 |
| 2002 | 13 | 22 | 39 | 10 | 26 | 23 | 38 | 20 | 27 | 76 | 32 | 11 |
| 2003 | 8 | 23 | 12 | 39 | 1 | 48 | 35 | 15 | 60 | 27 | 9 | 32 |
| 2004 | 8 | 16 | 40 | 25 | 12 | 33 | 49 | 39 | 26 | 47 | 26 | 36 |
| 2005 | 29 | 15 | 22 | 1 | 8 | 32 | 11 | 17 | 9 | 7 | 15 | |
From: John B. <j0n...@ya...> - 2004-12-25 13:12:57
Hi all,

I've got LAM-MPI 7.1.1 compiled and working with BProc support. I can run simple MPI programs, but only by copying the executables to the slave nodes. This defeats the whole point of using the BProc Distributed Process Space. Is it an implementation error, or something I've messed up?

I'm also getting an MPI_Recv "local process dead" error when trying out MPI POV-Ray. I'll get back soon with the exact error. Is there a fix?

Regards,
Jon
From: Reza S. <sh...@en...> - 2004-12-25 01:20:09
Hello,

Thanks to Steven James' suggestions, we can now spawn MATLAB on a child node, though there are a few outstanding issues. The following output is obtained when matlab is run on node 2 of our Clustermatic cluster:

> shahidi@controller:/usr/local/matlab6p5>
> /usr/local/matlab6p5/bin/matlab2 -nosplash -nojvm
> ??? MATLAB was unable to open a pseudo-tty: No such file or directory
> [2,1]
> The unix() and ! commands will not work in this MATLAB session. Other
> commands which depend upon unix() and ! will also fail. Your system may
> be running low on resources. If the problem persists after a reboot,
> check with your system administrator and confirm that your pty subsystem
> is properly configured.
>
> Warning: Unable to initialize X locale.
>
> < M A T L A B >
> Copyright 1984-2002 The MathWorks, Inc.
> Version 6.5.0.180913a Release 13
> Jun 18 2002
>
> Warning: Name is nonexistent or not a directory:
> /usr/local/matlab6p5/toolbox/local.
> ??? Undefined function or variable 'matlabrc'.
>
> >>

I think some type of filesystem on the nodes is probably the best way to go at this point. The nodes each have 40 GB hard drives, though writing to memory would be possible as well. A filesystem would also make it easier to save MATLAB data after running simulations, etc. But if anybody has further pointers on continuing with what we have on diskless nodes, those would also be welcome.

Happy holidays,

Reza

Steven James wrote:

> Greetings,
>
> Usually, there is an environment variable that can be set to specify a
> remote licence server. Pointing it at n-1 might help.
>
> G'day,
> sjames
>
> On Wed, 22 Dec 2004, Reza Shahidi wrote:
>
>> Hello,
>>
>> Todd McAnally actually gave the advice of rewriting the matlab shell
>> script so that it would use bpsh instead of exec. After adding the
>> appropriate libraries to the Clustermatic config file, matlab loads, but
>> then hangs because it can't access the license server.
>>
>> Perhaps a MATLAB mailing list/newsgroup would be the more
>> appropriate forum to inquire about this. Would mounting the directory
>> with the MATLAB license file on the license server onto the master node
>> of the cluster help (I'm not even sure if there is a license file per
>> se)? I'll talk more to our sysadmins tomorrow to see what can be done,
>> but if anybody can think of any other ways around this, then that would
>> also be welcome.
>>
>> Thanks,
>>
>> Reza
>>
>> -------------------------------------------------------
>> SF email is sponsored by - The IT Product Guide
>> Read honest & candid reviews on hundreds of IT Products from real users.
>> Discover which products truly live up to the hype. Start reading now.
>> http://productguide.itmanagersjournal.com/
>> _______________________________________________
>> BProc-users mailing list
>> BPr...@li...
>> https://lists.sourceforge.net/lists/listinfo/bproc-users
From: <ha...@no...> - 2004-12-23 22:25:59
The openMosix people seem to have had this experience with MATLAB:

"Any version of matlab can be made to migrate ... starting with -nojvm (disabling java). However, matlab processes will not do any useful work when migrated. My assumption is that matlab relies heavily on system calls."

http://howto.ipng.be/openMosixWiki/index.php/More%20on%20Matlab

So unless they hit some inefficiency particular to openMosix, this may be the case with BProc as well.

Vaclav Hanzl
From: Steven J. <py...@li...> - 2004-12-23 02:49:41
Greetings,

Usually, there is an environment variable that can be set to specify a remote licence server. Pointing it at n-1 might help.

G'day,
sjames

On Wed, 22 Dec 2004, Reza Shahidi wrote:

> Hello,
>
> Todd McAnally actually gave the advice of rewriting the matlab shell
> script so that it would use bpsh instead of exec. After adding the
> appropriate libraries to the Clustermatic config file, matlab loads, but
> then hangs because it can't access the license server.
>
> Perhaps a MATLAB mailing list/newsgroup would be the more
> appropriate forum to inquire about this. Would mounting the directory
> with the MATLAB license file on the license server onto the master node
> of the cluster help (I'm not even sure if there is a license file per
> se)? I'll talk more to our sysadmins tomorrow to see what can be done,
> but if anybody can think of any other ways around this, then that would
> also be welcome.
>
> Thanks,
>
> Reza

||||| |||| ||||||||||||| |||
by Linux Labs International, Inc.
Steven James, CTO

55 Marietta Street
Suite 1830
Atlanta, Ga 30303
866 824 9737 support
From: Steven J. <py...@li...> - 2004-12-23 02:47:55
On Wed, 22 Dec 2004 er...@he... wrote:

> The real problem is that stuff is missing on the slave nodes and we
> don't usually want to export the master's entire file system with NFS
> or something to get around this problem.
>
> I haven't thought of any other solution other than NFS mounting what
> you need. That would include the script itself in this case.
>
> I usually try and get people writing scripts turn sort of inside out.
> In other words the script runs on the front end and the binaries that
> grind a long time run on the back end. That's more of a pain in the
> butt with a canned product like matlab though.
>
> If somebody has an idea for a good way to deal with this, I'd love to
> hear it.
>
> - Erik

I don't know how GOOD it is, but I've been playing with a hack for bash under BProc. When bash finds itself on a slave, it immediately sets the ONNODE environment variable to the node number and does a bproc_move back to the master. Any program the script runs is run on $ONNODE, as are file access tests and even tab completion (so that export ONNODE=<n> gives the illusion of being 'logged in' to <n>).

In some ways it seems nice, but it also feels quite hackish and only helps bash scripts.

G'day,
sjames

||||| |||| ||||||||||||| |||
by Linux Labs International, Inc.
Steven James, CTO

55 Marietta Street
Suite 1830
Atlanta, Ga 30303
866 824 9737 support
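Steven's hack lives inside a patched bash, and its source is not shown in this thread. As a rough script-level approximation only, the dispatch rule he describes (run programs on $ONNODE when it is set) might be sketched as below. The helper name run_cmd and the BPSH override variable are made up for illustration; only ONNODE and the "bpsh <node> <command>" form come from the thread.

```shell
# Hypothetical helper, not Steven's patch: choose where a command runs
# based on ONNODE, the variable his modified bash sets.
run_cmd() {
    if [ -n "${ONNODE:-}" ]; then
        # ONNODE is set: start the command on that node through bpsh.
        # BPSH is an assumed override so the function can be tried without BProc.
        "${BPSH:-bpsh}" "$ONNODE" "$@"
    else
        # ONNODE is unset or empty: run the command where the script is.
        "$@"
    fi
}
```

With ONNODE=2 exported, `run_cmd ls /tmp` would list /tmp as node 2 sees it; with ONNODE unset, the same line runs locally. Unlike the patched shell, this does nothing for file tests or tab completion.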
From: Reza S. <sh...@en...> - 2004-12-23 02:15:22
Hello,

Todd McAnally actually gave the advice of rewriting the matlab shell script so that it would use bpsh instead of exec. After adding the appropriate libraries to the Clustermatic config file, matlab loads, but then hangs because it can't access the license server.

Perhaps a MATLAB mailing list/newsgroup would be the more appropriate forum to inquire about this. Would mounting the directory with the MATLAB license file on the license server onto the master node of the cluster help (I'm not even sure if there is a license file per se)? I'll talk more to our sysadmins tomorrow to see what can be done, but if anybody can think of any other ways around this, then that would also be welcome.

Thanks,

Reza
From: <er...@he...> - 2004-12-23 01:59:16
On Tue, Dec 21, 2004 at 09:20:36AM -0500, Daniel Gruner wrote:

> I have done this, and it works. Any time you need to run a script
> on the nodes there are similar problems.
>
> Erik, is it possible to actually fix this, in a different way than a
> hack?

I don't think so. The issue is that when a script gets executed (let's say foo.sh that uses /bin/sh), the binary that gets loaded is actually /bin/sh. The string foo.sh is given as an argument. The interpreter (sh) is expected to open foo.sh and do what it says. The wrinkle is that the interpreter wakes up and runs on the slave node, which probably doesn't have foo.sh.

Enter the elaborate hack that was simple enough to make some things work. It's sounding more and more ill-advised the more I hear about people's experiences with it. It violates the principle of least surprise.

Anyway, the hack basically says "ah, you're going to want this script" and adds it to the memory space of the process. Then BProc sets up a special file descriptor (3) which the interpreter can read to get the contents of the script. /proc/self/fd/3 is given as the argument so the program should get its own fd.

The real problem is that stuff is missing on the slave nodes and we don't usually want to export the master's entire file system with NFS or something to get around this problem.

I haven't thought of any solution other than NFS mounting what you need. That would include the script itself in this case.

I usually try to get people writing scripts to turn them sort of inside out. In other words, the script runs on the front end and the binaries that grind a long time run on the back end. That's more of a pain in the butt with a canned product like matlab though.

If somebody has an idea for a good way to deal with this, I'd love to hear it.

- Erik
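The interpreter behaviour Erik describes can be observed on any Linux machine, without BProc. The sketch below is an illustration only: the function name show_script_arg and the temporary foo.sh are made up for the demonstration. It writes a tiny script whose interpreter is /bin/sh and has the script print $0, which is the path the kernel handed to the interpreter.

```shell
# Illustration only: a "#!" script is started as "<interpreter> <script path>",
# so the interpreter must be able to open that path wherever it runs.
# show_script_arg is a made-up helper name, not part of BProc.
show_script_arg() {
    dir=$(mktemp -d) || return 1
    script="$dir/foo.sh"
    # Two-line script: the kernel reads "#!/bin/sh", loads /bin/sh, and
    # passes the script's path as an argument; inside the script that is $0.
    printf '%s\n' '#!/bin/sh' 'echo "interpreter was asked to open: $0"' > "$script"
    chmod +x "$script"
    "$script"
    status=$?
    rm -rf "$dir"
    return "$status"
}
```

Calling show_script_arg prints the temporary path ending in foo.sh. On a slave node that path would have to exist for /bin/sh to open it, which is exactly the step that fails and that the fd 3 hack works around.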
From: Daniel G. <dg...@cp...> - 2004-12-21 14:20:55
Hi,

Here is a snippet from an old posting by Erik:

>
> annwn:~> bpsh 0 tst
> /proc/self/fd/3
>
> Uh ?
> Something is missing in rfork/rexec to set up properly script names ?

Hehe. You've discovered the wacky shell script hack that I put in a while ago. The problem with shell scripts is that when the kernel sees '#!/bin/sh' in a script called X it actually does execve("/bin/sh", "X", 0). X isn't going to exist on the nodes most of the time. This is true in my world anyway. BProc, in an attempt to be tricky and get around this, puts the script in the process's memory space and then gives you this file descriptor on fd 3 that magically just reads the file from your own memory space.

This made a few perl users very happy a while ago.

If you don't want this hack, put a zero in /proc/sys/bproc/shell_hack.

- Erik

I have done this, and it works. Any time you need to run a script on the nodes there are similar problems.

Erik, is it possible to actually fix this, in a different way than a hack?

Regards,
Daniel

On Tue, Dec 21, 2004 at 09:29:53AM -0330, Reza Shahidi wrote:

> Hello,
>
> I guess the only problem is that all my children nodes are diskless.
> We have got tmpfs installed, but I don't know if that would be big
> enough to load MATLAB. My nodes do have hard drives, but we are just
> not using them, so maybe we should. The default with Clustermatic is
> diskless nodes. Ideally, it would be better to be able to run MATLAB
> with our current setup.
>
> Thanks,
>
> Reza

--
Dr. Daniel Gruner dg...@ch...
Dept. of Chemistry dan...@ut...
University of Toronto phone: (416)-978-8689
80 St. George Street fax: (416)-978-5325
Toronto, ON M5S 3H6, Canada finger for PGP public key
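The switch named in Erik's old posting can be read or changed from a shell. The following is a minimal sketch under stated assumptions: the function name shell_hack is made up, the optional second argument exists only so the function can be tried against an ordinary file, and writing to the real /proc entry presumably needs root. The posting does not say on which machine the setting has to be changed.

```shell
# Sketch: show, and optionally set, the BProc script hack switch.
# usage: shell_hack [VALUE] [FILE]
# FILE defaults to the path quoted in Erik's posting.
shell_hack() {
    file=${2:-/proc/sys/bproc/shell_hack}
    if [ -n "${1:-}" ]; then
        # A zero turns the hack off, per the posting.
        printf '%s\n' "$1" > "$file" || return 1
    fi
    # Print the current value either way.
    cat "$file"
}
```

For example, `shell_hack` alone prints the current setting and `shell_hack 0` writes a zero and prints the result.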
From: Todd M. <to...@ac...> - 2004-12-21 13:52:39
I don't know what the script does; I assume general setup and config stuff. You may want to modify it (a copy of it :-) ) so that you give the script a node and it calls bpsh. That way it will pick up the libraries while still on the master.

Todd

On Dec 21, 2004, at 8:44 AM, Reza Shahidi wrote:

> Hello,
>
> "matlab" is a script, and runs another "matlab" in a glnx86
> directory, which requires a lot of libraries. We don't have NFS
> working for the cluster node hard drives, but that may be necessary.
> I'm just a user, so I have to ask our sys-admin to do most non-trivial
> tasks on our cluster.
>
> Thank you very much for all of your help.
>
> Best wishes,
>
> Reza
>
> Todd McAnally wrote:
>
>> What about NFS? Did you find that the "matlab" in your path is a
>> script? I don't have matlab so I can't check.
>>
>> Todd
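Todd's suggestion (give the wrapper a node and let it call bpsh) might look roughly like this at the tail end of a modified copy. This is a sketch, not the actual MATLAB wrapper: the function name launch_on_node, its argument order, and the BPSH dry-run override are all assumptions made for illustration; only the "bpsh <node> <command>" form comes from the thread.

```shell
# Sketch of the last step of a modified wrapper copy (hypothetical names).
# usage: launch_on_node NODE REAL_BINARY [ARGS...]
launch_on_node() {
    if [ "$#" -lt 2 ]; then
        echo "usage: launch_on_node NODE REAL_BINARY [ARGS...]" >&2
        return 2
    fi
    node=$1
    real_bin=$2
    shift 2
    # The wrapper's usual environment setup would stay above this point,
    # running on the master. The original wrapper would then finish with
    # something like: exec "$real_bin" "$@"
    # The modified copy hands the binary to bpsh instead. BPSH is an assumed
    # override for dry runs, deliberately unquoted so it may hold "echo bpsh".
    ${BPSH:-bpsh} "$node" "$real_bin" "$@"
}
```

Setting BPSH="echo bpsh" prints the command that would be run instead of running it, which is a cheap way to check the arguments before involving a node.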
From: Reza S. <sh...@en...> - 2004-12-21 12:59:09
Hello,

I guess the only problem is that all my children nodes are diskless. We have got tmpfs installed, but I don't know if that would be big enough to load MATLAB. My nodes do have hard drives, but we are just not using them, so maybe we should. The default with Clustermatic is diskless nodes. Ideally, it would be better to be able to run MATLAB with our current setup.

Thanks,

Reza
From: Todd M. <to...@ac...> - 2004-12-21 05:16:55
|
I've seen something similar, though not exactly what you're seeing, and it wasn't with MATLAB. I got something like the first error when running a shell script in which something couldn't access what it needed on the node.

Is the matlab that it's finding in your path an executable, or a script wrapper that runs the real executable? If it's a wrapper, the shell is loaded on the master and migrated, and any executables it then tries to load come from the node. In that case the real executable and all dependent libraries need to be accessible from the node.

I've seen the second error message when the executable you're trying to load, or any of its dependent libraries, isn't accessible from the node. If this is indeed what you want to do, either copy them over or make them available via a local or NFS disk. Run "ldd <executable name>" to list all of the dependent libraries.

Hope this helps.

Todd

On Dec 20, 2004, at 7:17 PM, Reza Shahidi wrote:

> Hello,
>
> I have been having trouble getting MATLAB working with BProc. To
> start off simple, I just want to be able to run matlab using bpsh just
> on one node. FYI, we are using Clustermatic. When I type "bpsh 0
> matlab", I get the following error message:
>
>> /proc/self/fd/3: /proc/self/fd/3: Permission denied
>
> I have looked over some previous posts, and one has suggested using
> the full path to MATLAB. This doesn't make any difference to what is
> output. Another post suggests running MATLAB through bash or a
> similar shell. This gives:
>
>> shahidi@controller:~> bpsh 0 bash matlab
>> matlab: matlab: No such file or directory
>
> The last output above occurs with any command, not just matlab.
> Has anybody else encountered this problem? Can it be solved?
>
> Thanks,
>
> Reza |
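A minimal sketch of the ldd check suggested above, assuming the goal is a plain list of library paths that must be reachable from the node. The helper name parse_ldd is hypothetical; it only filters text in the usual ldd output layout and does not touch BProc itself.

```shell
# Hypothetical helper: summarise `ldd` output read from stdin.
# Prints each resolved library path, and "MISSING <name>" for unresolved ones.
parse_ldd() {
    awk '
        $2 == "=>" && $3 == "not" { print "MISSING " $1; next }  # name => not found
        $2 == "=>" && $3 ~ /^\//  { print $3; next }             # name => /path (addr)
        $1 ~ /^\//                { print $1 }                   # /path (addr)
    '
}
```

Typical use would be something like `ldd /path/to/real/executable | parse_ldd`, run on the master against the real binary rather than the wrapper script. Any path it prints has to exist on the node, whether copied over or mounted.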
From: Reza S. <sh...@en...> - 2004-12-21 00:17:22
|
Hello,

I have been having trouble getting MATLAB working with BProc. To start off simple, I just want to be able to run matlab using bpsh just on one node. FYI, we are using Clustermatic. When I type "bpsh 0 matlab", I get the following error message:

> /proc/self/fd/3: /proc/self/fd/3: Permission denied

I have looked over some previous posts, and one has suggested using the full path to MATLAB. This doesn't make any difference to what is output. Another post suggests running MATLAB through bash or a similar shell. This gives:

> shahidi@controller:~> bpsh 0 bash matlab
> matlab: matlab: No such file or directory

The last output above occurs with any command, not just matlab. Has anybody else encountered this problem? Can it be solved?

Thanks,

Reza |
From: Michal J. <mi...@ha...> - 2004-12-16 19:31:10
|
On Thu, Dec 16, 2004 at 01:16:49PM +0000, John Bonn wrote:
>
> Now more doubts -->
>
> 1) Is there any mechanism for sending across the
> libraries during runtime, than specifyin it config and
> booting up everytime ???

Sure there is. First of all you have 'bcp', and second, the whole slew of Unix/Linux tools at your disposal. In particular, look in the archives for my postings from August 20th and October 16th with the subject "Note on modules on nodes .... ". They use NFS as a particular example, but the same mechanisms can be used all over the place. The notes show how to execute desired scripts "automagically" on startup, but nothing really prevents you from running something like that at any moment (even typing all the commands directly from a keyboard).

> Having some probs with NFS right now... Not gettin it
> ready to work.

So read those postings of mine again. NFS is not without faults and limitations, but it works if approached properly. Make sure that you are not using v2 with small read and write buffers. Things need to be configured.

Michal |
From: Ted S. <tsa...@cr...> - 2004-12-16 19:30:20
|
Hi,

Recently I installed CM5 on an Opteron cluster and it works great. The problem I have is that if the code contains a call to mpi_wtime, I get: undefined reference to `mpi_wtime_'. I searched for the symbol in all libraries under mpich/lib, but it is not there. Does anybody else have a similar problem? Below is a simple program that runs fine on another cluster but fails with the mpich from CM5.

Thanks,
Ted

[kenzakow@xtreme101 ~]$ cat hello.f
      program main
      include 'mpif.h'
      integer:: ierr
      real*8:: time_proc
      call mpi_init(ierr)
      time_proc= mpi_wtime()
      stop
      end program
[kenzakow@xtreme101 ~]$ mpif90 hello.f
hello.o(.text+0x1e): In function `MAIN_':
: undefined reference to `mpi_wtime_' |
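A rough sketch of the library search Ted describes, assuming it is done by filtering the output of `nm -A` over the libraries. The helper name find_defined is hypothetical. It matches case-insensitively and by substring, so it also shows spellings with a different number of trailing underscores, which is one thing worth comparing against the `mpi_wtime_` the linker asks for.

```shell
# Hypothetical helper: filter `nm -A` output (stdin) for *defined* symbols
# whose name contains the given fragment, ignoring case.
# Usage: nm -A /path/to/mpich/lib/*.a 2>/dev/null | find_defined mpi_wtime
find_defined() {
    awk -v pat="$1" '
        # Skip blank lines and undefined ("U") references; match on the last field.
        NF >= 2 && $(NF-1) != "U" && index(tolower($NF), tolower(pat)) { print }
    '
}
```

If this prints nothing for any library on the link line, the Fortran binding for that routine is not in the libraries being searched; if it prints a differently spelled name, the mismatch is in how the symbol name was generated.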
From: John B. <j0n...@ya...> - 2004-12-16 13:16:59
|
Hi folks...

Successfully managed to get bproc running and have now fixed LAM-MPI too! Working very well.

Things to take notice of:

1) Set proper permissions for the user who is performing lamboot: bpctl -S xx -m 111
2) While preparing the bhost file, add the master node IP/hostname at the top. I had problems otherwise.

Now more doubts:

1) Is there any mechanism for sending across the libraries at runtime, rather than specifying them in the config and booting up every time?
2) Any good recommendation for a network FS? What happened to V9FS and the like? I checked out their dev page; the client is yet to be released (?). Any test results or other good suggestions?

Having some problems with NFS right now; not getting it to work. I just added the modules (nfs.ko, sunrpc.ko and lockd.ko) using the bootmodule param in config.boot. When trying to boot up, it gets stuck without much detail in /var/log! Any idea? I feel NFS would be really heavy on the network! I would go for other suggestions.

Regards...
Jon |
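A small sketch of the second tip, assuming a bhost file with one host per line. The helper name make_bhost and the addresses in the usage note are placeholders, not values from Jon's cluster.

```shell
# Hypothetical helper: print a LAM boot schema (bhost file) with the master first.
# Usage: make_bhost MASTER NODE...
make_bhost() {
    master=$1
    shift
    printf '%s\n' "$master"                    # master node goes on the first line
    for node in "$@"; do
        [ "$node" = "$master" ] && continue    # avoid listing the master twice
        printf '%s\n' "$node"
    done
}
```

For example, `make_bhost 10.0.0.10 10.0.0.11 10.0.0.12 > bhost` would write a three-line file with the master at the top, which could then be handed to lamboot.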
From: Joshua B. <bj...@en...> - 2004-12-14 18:58:27
|
On Dec 14, 2004, at 11:54 AM, Chris Sideroff wrote:

> Quoting Joshua Bernstein <bj...@en...>:
>
>> Your best bet is to compile Fluent using the Bproc based MPI
>> routines. Then you won't have to fight with it and hack it together
>> as much.
>
> Am I missing something ... ? How could I do this when the source
> code is not available.

My mistake. I forgot for a second you could get the libraries you needed :-)

>> However, you may also need to add all your hosts to a nodes file in
>> the MPI directory. I have a machines.LINUX file that I had to use to
>> get the non-bproced MPI routines to recognize the nodes. My file looks
>> like:
>>
>> ...
>> .36:2
>> .37:2
>> .38:2
>> .39:2
>> ...
>>
>> Judging from your links I would guess or try dropping this file in
>> /opt/Fluent.Inc/fluent6.2.9/multiport/lnx86/nmpi/share
>> if share isn't there, go ahead and create it. Usually this file should
>> be in $MPI_HOME/share.
>
> I did this without any change to my problems.
>
> There is one more thing I forgot to ask before but do I need to load
> the MPICH libraries on to the compute nodes? Right now there are no
> MPICH libs loaded or in the config file to be loaded on the compute
> nodes - I should do this? Sorry for newb question.

They don't actually have to be there; they just have to be available to the compute nodes, especially if your version of Fluent is dynamically linked. Try running ldd fluent, and see if any MPI libraries are listed. Make sure the path to those libraries is listed in the libraries line of your config file.

Also, try looking at (or even sending me) the output of

strace <fluent command line>

That should show you what's going on and perhaps what's being looked for by Fluent and MPI.

-Josh |
From: Joshua B. <bj...@en...> - 2004-12-14 18:33:03
|
Chris,

Your best bet is to compile Fluent using the Bproc based MPI routines. Then you won't have to fight with it and hack it together as much. However, you may also need to add all your hosts to a nodes file in the MPI directory. I have a machines.LINUX file that I had to use to get the non-bproced MPI routines to recognize the nodes. My file looks like:

...
.36:2
.37:2
.38:2
.39:2
...

Judging from your links, I would try dropping this file in

/opt/Fluent.Inc/fluent6.2.9/multiport/lnx86/nmpi/share

If share isn't there, go ahead and create it. Usually this file should be in $MPI_HOME/share.

Hope that helps a bit

-Joshua Bernstein

On Dec 14, 2004, at 8:54 AM, Chris Sideroff wrote:

> Thanks for the reply. This seems to have helped a little but I'm
> still having difficulty. If you don't mind I'll tell you what I did
> and what's still going wrong.
>
> I set the P4_RSHCOMMAND variable to bpsh as you mentioned. Fluent
> comes with its own version of MPICH so I made a soft link to the /bin
> and /lib dirs of the bproc MPICH. First I renamed the Fluent MPICH
> dirs and then made the soft links:
>
> [root ~]# mv /opt/Fluent.Inc/fluent6.2.9/multiport/lnx86/nmpi/bin /opt/Fluent.Inc/fluent6.2.9/multiport/lnx86/nmpi/bin_orig
> [root ~]# mv /opt/Fluent.Inc/fluent6.2.9/multiport/lnx86/nmpi/lib /opt/Fluent.Inc/fluent6.2.9/multiport/lnx86/nmpi/bin_orig
> [root ~]# ln -s /usr/mpich-p4/bin /opt/Fluent.Inc/fluent6.2.9/multiport/lnx86/nmpi/bin
> [root ~]# mv /usr/mpich-p4/lib /opt/Fluent.Inc/fluent6.2.9/multiport/lnx86/nmpi/lib
>
> I then 'export P4_RSHCOMMAND=bpsh' as a normal user and start Fluent
> like so:
>
> [user ~]# fluent 2ddp -t2 -pnmpi
>
> and get the following error:
>
> Invalid number of processes: 4
>
> Any other ideas or tips you know of? Thanks for your help, Chris
>
> Quoting Joshua Bernstein <bj...@en...>:
>
>> I've had this same problem running MCNPX (from LANL). MPICH supports
>> the environment variable P4_RSHCOMMAND. You can do something like:
>>
>> export P4_RSHCOMMAND=bpsh
>>
>> Hope that helps.
>>
>> -Joshua Bernstein
>> Systems Analyst
>> University of Arizona
>> Tucson, Arizona, USA
>> On Dec 13, 2004, at 9:00 PM, bpr...@li... wrote:
>>
>>> Message: 1
>>> From: Chris Sideroff <cns...@sy...>
>>> To: "bpr...@li..." <bpr...@li...>
>>> Date: Sun, 12 Dec 2004 23:03:14 -0500
>>> Subject: [BProc] running fluent under bproc
>>>
>>> I have a small (5) cluster of dual P3 that I've managed to get bproc
>>> running on. I want to run the CFD application Fluent on it. Fluent
>>> comes with MPICH 1.2.4 so I didn't think it would be too difficult
>>> to get it to use the bproc modified MPICH. Unfortunately, the Fluent
>>> start script wants to use 'rsh' when launching the mpi version - which
>>> of course it can't do without rsh ;-). Has anyone had any experiences
>>> they would like to share at getting Fluent to work with a bproc cluster?
>>>
>>> Chris |
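A minimal sketch of generating a machines.LINUX file in the ".N:CPUS" form shown above, assuming consecutive node numbers with the same CPU count. The helper name gen_machines is hypothetical, and the 36 to 39 range in the usage note is simply the excerpt from the message.

```shell
# Hypothetical helper: print machines.LINUX entries of the form ".N:CPUS".
# Usage: gen_machines FIRST LAST CPUS
gen_machines() {
    n=$1
    while [ "$n" -le "$2" ]; do
        printf '.%d:%d\n' "$n" "$3"   # one line per node, e.g. ".36:2"
        n=$((n + 1))
    done
}
```

Something like `gen_machines 36 39 2 > "$MPI_HOME/share/machines.LINUX"` would reproduce the four lines quoted in the message; `export P4_RSHCOMMAND=bpsh` is still needed separately so that MPICH launches through bpsh.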
From: John B. <j0n...@ya...> - 2004-12-14 13:35:27
|
Hi folks!

I was trying to get Clustermatic 5 up and running. I first tried it on Fedora Core 1. It had compatibility problems (making the initrd image and probing the right modules), probably due to the default 2.4.x kernels. I changed over to Fedora Core 2 and got it installed and running properly on the master.

Over to the client side, more problems. I've done PXE on some systems and Etherboot on a few others. The Phase 2 image is loading fine. When I tried loading Phase 1 earlier, it showed: Failed to load module "kmonte". Back to Phase 2: the RARP request is coming in and an IP address is assigned. (I noticed one problem: the MAC address has to be specified in lowercase letters in /etc/clustermatic/config; it didn't boot up otherwise!) After that, once bpslave is running, it is assigned the PID, and while connecting to the server it reports:

(connect:10.0.0.10:2223): Invalid argument

When I tried running bpslave from the master node, the connection was established and the master was able to control and run all tasks on the "slave". Not from remote! Any suggestions?

Also, how stable is LAM/MPI on BProc?

Regards,
j0nyb0ny |
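A tiny sketch for the lowercase-MAC observation above, assuming addresses are copied from a source that prints them in uppercase. The helper name normalize_mac and the sample address are made up for illustration.

```shell
# Hypothetical helper: lowercase the hex digits of a MAC address
# before pasting it into /etc/clustermatic/config.
normalize_mac() {
    printf '%s\n' "$1" | tr 'A-F' 'a-f'
}
```

For instance, `normalize_mac 00:0C:29:AB:CD:EF` gives the same address with lowercase hex digits.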
From: Joshua B. <bj...@en...> - 2004-12-14 06:13:36
|
I've had this same problem running MCNPX (from LANL). MPICH supports the environment variable P4_RSHCOMMAND. You can do something like:

export P4_RSHCOMMAND=bpsh

Hope that helps.

-Joshua Bernstein
Systems Analyst
University of Arizona
Tucson, Arizona, USA

On Dec 13, 2004, at 9:00 PM, bpr...@li... wrote:

> Message: 1
> From: Chris Sideroff <cns...@sy...>
> To: "bpr...@li..." <bpr...@li...>
> Date: Sun, 12 Dec 2004 23:03:14 -0500
> Subject: [BProc] running fluent under bproc
>
> I have a small (5) cluster of dual P3 that I've managed to get bproc
> running on. I want to run the CFD application Fluent on it. Fluent
> comes with MPICH 1.2.4 so I didn't think it would be too difficult to
> get it to use the bproc modified MPICH. Unfortunately, the Fluent
> start script wants to use 'rsh' when launching the mpi version - which
> of course it can't do without rsh ;-). Has anyone had any experiences
> they would like to share at getting Fluent to work with a bproc cluster?
>
> Chris |
From: Steven J. <py...@li...> - 2004-12-13 18:54:02
|
Greetings,

I meant to answer that! :-)

Do

bpctl -S allup -m 0111

The mode, uid, and gid work like they would for a file, except that only x is meaningful.

G'day,
sjames

On Mon, 13 Dec 2004, Chris Sideroff wrote:

> I have a newbie question about setting up bproc. When I try to use bpsh as a
> non-root user I get the following error:
>
> [user ~]$ bpsh n0 hostname
> 0: Operation not permitted
>
> How (or where) do I change the permissions to allow normal users access?
>
> Thanks
>
> Quoting Steven James <py...@li...>:
>
> > Greetings,
> >
> > I recently did a patch to bpsh so it can be softlinked as rsh and make
> > Fluent (and others) work. It's been included in the latest version.
> >
> > G'day,
> > sjames
> >
> > On Sun, 12 Dec 2004, Chris Sideroff wrote:
> >
> > > I have a small (5) cluster of dual P3 that I've managed to get bproc
> > > running on. I want to run the CFD application Fluent on it. Fluent
> > > comes with MPICH 1.2.4 so I didn't think it would be too difficult to
> > > get it to use the bproc modified MPICH. Unfortunately, the Fluent start
> > > script wants to use 'rsh' when launching the mpi version - which of
> > > course it can't do without rsh ;-). Has anyone had any experiences they
> > > would like to share at getting Fluent to work with a bproc cluster?
> > >
> > > Chris

Linux Labs International, Inc.
Steven James, CTO

55 Marietta Street
Suite 1830
Atlanta, Ga 30303
866 824 9737 support |
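A small sketch of how a node mode such as 0111 reads, given that only the x bits are meaningful. The helper name who_can_run is hypothetical and only decodes the octal number; it does not query or change any node.

```shell
# Hypothetical helper: list which classes (owner, group, other) have the
# execute bit set in an octal mode such as 0111 or 110.
who_can_run() {
    mode=$((0$1))                                  # leading 0 forces octal
    out=""
    [ $((mode & 0100)) -ne 0 ] && out="$out owner"
    [ $((mode & 0010)) -ne 0 ] && out="$out group"
    [ $((mode & 0001)) -ne 0 ] && out="$out other"
    printf '%s\n' "${out# }"                       # drop the leading space
}
```

With mode 0111 all three classes are listed, which is why that setting lets ordinary users run bpsh against the nodes; a mode like 0110 would leave out users who are neither the node's owner nor in its group.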
From: Chris S. <cns...@ec...> - 2004-12-13 18:34:16
|
I have a newbie question about setting up bproc. When I try to use bpsh as a non-root user I get the following error:

[user ~]$ bpsh n0 hostname
0: Operation not permitted

How (or where) do I change the permissions to allow normal users access?

Thanks

Quoting Steven James <py...@li...>:

> Greetings,
>
> I recently did a patch to bpsh so it can be softlinked as rsh and make
> Fluent (and others) work. It's been included in the latest version.
>
> G'day,
> sjames
>
> On Sun, 12 Dec 2004, Chris Sideroff wrote:
>
> > I have a small (5) cluster of dual P3 that I've managed to get bproc
> > running on. I want to run the CFD application Fluent on it. Fluent
> > comes with MPICH 1.2.4 so I didn't think it would be too difficult to
> > get it to use the bproc modified MPICH. Unfortunately, the Fluent start
> > script wants to use 'rsh' when launching the mpi version - which of
> > course it can't do without rsh ;-). Has anyone had any experiences they
> > would like to share at getting Fluent to work with a bproc cluster?
> >
> > Chris
>
> Linux Labs International, Inc.
> Steven James, CTO
>
> 55 Marietta Street
> Suite 1830
> Atlanta, Ga 30303
> 866 824 9737 support |
From: Steven J. <py...@li...> - 2004-12-13 05:44:13
|
Greetings,

I recently did a patch to bpsh so it can be softlinked as rsh and make Fluent (and others) work. It's been included in the latest version.

G'day,
sjames

On Sun, 12 Dec 2004, Chris Sideroff wrote:

> I have a small (5) cluster of dual P3 that I've managed to get bproc
> running on. I want to run the CFD application Fluent on it. Fluent
> comes with MPICH 1.2.4 so I didn't think it would be too difficult to
> get it to use the bproc modified MPICH. Unfortunately, the Fluent start
> script wants to use 'rsh' when launching the mpi version - which of
> course it can't do without rsh ;-). Has anyone had any experiences they
> would like to share at getting Fluent to work with a bproc cluster?
>
> Chris

Linux Labs International, Inc.
Steven James, CTO

55 Marietta Street
Suite 1830
Atlanta, Ga 30303
866 824 9737 support |
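A minimal sketch of the softlink arrangement described above, assuming a bpsh build that includes the patch and a private directory placed ahead of the system directories in PATH. The helper name link_as_rsh and the directory in the usage note are hypothetical.

```shell
# Hypothetical helper: create DIR/rsh as a symlink to the program at TARGET.
# Usage: link_as_rsh TARGET DIR
link_as_rsh() {
    target=$1
    dir=$2
    mkdir -p "$dir" && ln -sf "$target" "$dir/rsh"
}
```

Something like `link_as_rsh "$(command -v bpsh)" "$HOME/bin"` followed by `PATH="$HOME/bin:$PATH"` would make a start script that calls rsh pick up bpsh instead. Whether the arguments are then interpreted correctly depends on the patched bpsh, not on the link itself.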
From: Chris S. <cns...@sy...> - 2004-12-13 04:03:19
|
I have a small (5) cluster of dual P3 that I've managed to get bproc running on. I want to run the CFD application Fluent on it. Fluent comes with MPICH 1.2.4 so I didn't think it would be too difficult to get it to use the bproc modified MPICH. Unfortunately, the Fluent start script wants to use 'rsh' when launching the mpi version - which of course it can't do without rsh ;-). Has anyone had any experiences they would like to share at getting Fluent to work with a bproc cluster?

Chris |
From: Rene S. <rs...@tu...> - 2004-12-11 17:19:23
|
Hi,

Does anyone know if the Maui scheduler can be used with bjs? I found some postings on clubmask (http://clubmask.sourceforge.net/), which uses bproc and maui, but I don't think it uses bjs.

We need a queuing system for our cluster that has some sort of preemption and job priorities. Basically, we need an Express queue that preempts all other jobs running on the cluster, so that any job submitted to the Express queue does not have to wait. We use PBSPro to do some of these things on our SGI machines, but as far as I can tell it does not run on BProc machines.

I am trying to get an idea of what queuing systems people are using on BProc clusters so I can see what our options are. Thanks for any input/comments on this.

Rene |
From: <ha...@no...> - 2004-12-01 12:41:42
|
> > I'd be curious to know if 32-bit migration with a 64-bit kernel works
> > in the Open Source version...
>
> It does.
>
> If the Scyld one is not open source they're in violation of the
> license on my code. My code (which forms the basis of their code) is
> GPLed. Their version must be GPLed as well.

Scyld used to work with some GPLed programs in an unusual yet still legal way: they distributed modified versions to their customers only, and the customers did not use their right to distribute them further. (I hope their customers really did/do have this right and there is no indirect legal trick which could be used to defeat the GPL this way.)

Vaclav Hanzl |