Re: [BProc] boot.img never comes on clustermatic

Thank you very much, Lothar.

I do not think your patch helped, because this is how I managed
to make Clustermatic work:

Since we are building a diskless cluster at the same time, and I
was destroying floppy disks in my attempts to make a working one,
and moreover I got tired of waiting 2 minutes at each boot attempt,
I decided to give PXELINUX a try. I ran into some problems: the
switch enabled the Ethernet ports to the nodes more slowly than
PXE/DHCP took to give up (thanks to the very boot speed I was
looking for). I had to solve it by setting spanning tree to fast
and channel to off, as suggested by Cisco. After that I managed to
load a phase 2 image directly by PXE. So everything seems to work ...
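
For anyone trying the same route, a PXELINUX setup of this kind might
look roughly like the sketch below. It is only an illustration, assuming
the phase 2 image is available as a separate kernel and initrd in the
TFTP root; TFTP_ROOT, node.kernel and node.initrd are hypothetical names
and not the ones used on this cluster.

```shell
#!/bin/sh
# Sketch only: write a minimal PXELINUX config for a node kernel and initrd.
# TFTP_ROOT, node.kernel and node.initrd are hypothetical names, not taken
# from this thread; adjust them to wherever your images actually live.
set -eu

TFTP_ROOT="${TFTP_ROOT:-./tftpboot}"
mkdir -p "$TFTP_ROOT/pxelinux.cfg"

# "default" is the file PXELINUX falls back to when no per-node
# config file matches the booting client.
cat > "$TFTP_ROOT/pxelinux.cfg/default" <<'EOF'
DEFAULT node
PROMPT 0
LABEL node
  KERNEL node.kernel
  APPEND initrd=node.initrd
EOF

echo "wrote $TFTP_ROOT/pxelinux.cfg/default"
```

PROMPT 0 makes the nodes boot the default label without waiting at a
prompt, which matters when the whole point is a fast unattended boot.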

thanks anyway

ot...@tr... wrote:
> I am copying below two messages which made it work, though I am not
> totally sure it is your problem.
> 
> Lothar
> 
> 
> I just saw something like this on a cluster here and I fixed it.  If
> you're having the same problem, the following patch should fix it for
> you.  This only modifies the "beoserv" binary (which runs on the
> master and does things like serve boot images and RARP responses) so
> you won't have to change the slave boot images or anything.
> 
> Index: mcsend.c
> ===================================================================
> RCS file: /users/hendriks/repository/beoboot/mcsend.c,v
> retrieving revision 1.25
> retrieving revision 1.26
> diff -u -r1.25 -r1.26
> --- mcsend.c    27 Aug 2002 16:25:13 -0000    1.25
> +++ mcsend.c    17 Dec 2002 21:49:13 -0000    1.26
> @@ -28,7 +28,7 @@
>  * negligence or otherwise) arising in any way out of the use of this
>  * software, even if advised of the possibility of such damage.
>  *
> - *  $Id: mcsend.c,v 1.25 2002/08/27 16:25:13 hendriks Exp $
> + *  $Id: mcsend.c,v 1.26 2002/12/17 21:49:13 hendriks Exp $
>  *--------------------------------------------------------------------*/
> #include <sys/time.h>
> #include <sys/types.h>
> @@ -1029,6 +1029,10 @@
>         }
>         break;
>         case SND_TIME_WAIT:
> +        if (ifc->sendok > 0 &&
> +            !FD_ISSET(ifc->fd, wset) && sender_ready(s))
> +            FD_SET(ifc->fd, wset);
> +
>         timeleft = SENDER_TIMEOUT - (now.tv_sec - s->lastuse);
>         if (timeleft <= 0) {
>             sender_discard(s);
> 
> 
> 
> You'll have to rebuild beoboot.  Grab the source RPM (included in
> Clustermatic 3) and apply this patch to it.
> 
> You'll have to do something like:
> 
> rpm -i beoboot-....src.rpm
> 
> rpmbuild -bp /usr/src/redhat/SPECS/beoboot.spec
> 
> cd  /usr/src/redhat/BUILD/beoboot-....
> 
> patch -p1 < patchfile
> 
> make beoserv
> 
> 
> Then replace the beoserv in /usr/sbin with the one built there.  See
> local Linux guru for more help on building stuff. :)
> 
> - Erik
> 
>> Erik A. Hendriks wrote:
>>
>> >On Mon, Dec 16, 2002 at 10:05:11AM -0800, lo...@tr... wrote:
>> >  >
>> >>Well that's how it goes. Looks to me as if the problem is
>> >>on the master side....but no idea what.
>> >>    
> 
> 
> 
> 
> Lothar
> 
> 
> Florent Calvayrac wrote:
> 
>> Hi
>>
>> I gave a try to clustermatic today, on a test cluster
>> made of 8 nodes with dual pentium 3 processors,
>> ServerWorks chipset with integrated eepro100, plus one eepro
>> on the master, myrinet 2000 boards and switch.
>>
>> Either with fast ethernet or myrinet the nodes
>> hang on the boot.img fetch after they got an
>> IP from RARP. I tried the mcastbcast hack,
>> crossover cables instead of our Cisco switch,
>> and boot over myrinet (with different MACs )  :
>>  all the same, nothing happens
>> and tcpdump does not show anything after the RARP
>> resolution.
>>
>> I chose addresses in the 192.168.33 range in order
>> not to interfere with the campus here.
>>
>> Did I forget something in the redhat 8.0 security
>> settings ? I set no firewall and started all services...
>>
>> Please help
>>
>> thanks in advance
>>
>>
> 
> 

-- 
Florent Calvayrac                          | Tel : 02 43 83 26 26
Laboratoire de Physique de l'Etat Condense | Fax : 02 43 83 35 18
UMR-CNRS 6087         | http://www.univ-lemans.fr/~fcalvay
Universite du Maine-Faculte des Sciences   |
72085 Le Mans Cedex 9