From: Yao C. L. <yc...@as...> - 2014-07-14 09:02:19
Thanks. I have read the SA Forum spec on IMM, but have not read the one in OpenSAF. I have just skimmed it and will read it in depth soon. Thanks for the advice.

Ted

-----Original Message-----
From: Anders Bjornerstedt [mailto:and...@er...]
Sent: Monday, July 14, 2014 4:56 PM
To: Yao Cheng LIANG
Cc: ope...@li...; santosh satapathy
Subject: Re: [users] One of the controller wait for sync

Hi,

It sounds like you don't have a shared file system mounted between SC1 and SC2.
That means you cannot run what is called 1PBE, which relies on a shared file system.
But you can run 0PBE or 2PBE. PBE = Persistent Back End.

But in any case, your initial problem of SC2 not getting synced is strange.

If you have not already done so, you need to read the documentation for the IMM:
either the OpenSAF_IMMSV_PR.doc or the osaf/services/saf/immsv/README,
in particular the overview parts that explain IMM loading, IMM sync and PBE.

/Anders Bjornerstedt

Yao Cheng LIANG wrote:
> Dear Anders,
>
> Thanks for the clarification. By "sync" here I mean copying the file from one node to the other.
>
> Ted
>
> -----Original Message-----
> From: Anders Bjornerstedt [mailto:and...@er...]
> Sent: Monday, July 14, 2014 4:48 PM
> To: Yao Cheng LIANG
> Cc: ope...@li...; santosh satapathy
> Subject: Re: [users] One of the controller wait for sync
>
> Hi,
>
> There is no such thing as "sync the imm.xml file".
> Sync is a protocol where the IMMND at one of the SCs broadcasts the IMM contents (from memory) to any nodes that are "empty" and ready to receive the sync/data. Any node that has sent a sync request is ready to receive the sync.
>
> An imm.xml file can be used for loading.
> Sync is performed by nodes that *missed* loading.
>
> (An imm.xml file can also be used to create a CCB using 'immcfg -f', but I don't think that is what you meant.)
>
> /Anders Bjornerstedt
>
> Yao Cheng LIANG wrote:
>
>> Thanks. I resolved the issue by syncing the imm.xml file on the two controllers.
>> /Ted
>>
>> -----Original Message-----
>> From: Anders Bjornerstedt [mailto:and...@er...]
>> Sent: Monday, July 14, 2014 4:13 PM
>> To: Yao Cheng LIANG
>> Cc: ope...@li...; santosh satapathy
>> Subject: Re: [users] One of the controller wait for sync
>>
>> Hi,
>>
>> The sync request from SC2 clearly reaches SC1.
>> Is any sync started at SC1?
>> I can't tell, because the syslog snippet from SC1 is minimal, truncated right after the request arrives.
>>
>> /Anders Bjornerstedt
>>
>> Yao Cheng LIANG wrote:
>>
>>> Dear all,
>>>
>>> I am using OpenSAF 4.2.2, and when I start SC-2 after SC-1, the messages below appear in /var/log/message on sc-1:
>>> ------------------------------------------------------
>>> Jul 12 22:35:26 localhost osaffmd[11690]: Peer Node_id 328207 : EE_ID safEE=Linux_os_hosting_clm_node,safHE=4500_slot_14,safDomain=domain_1
>>> Jul 12 22:35:26 localhost osafimmd[11706]: New IMMND process is on STANDBY Controller at 5020f
>>> Jul 12 22:35:26 localhost osafimmd[11706]: IMMND on controller (not currently coord) requests sync
>>> Jul 12 22:35:26 localhost osafimmd[11706]: Node 5020f request sync sync-pid:8930 epoch:0
>>> ------------------------------------------------------
>>>
>>> while on sc-2, the messages below appear in /var/log/message:
>>> ------------------------------------------------------
>>> Jul 12 22:35:26 WR20-64_32 opensafd: Starting OpenSAF Services
>>> Jul 12 22:35:26 WR20-64_32 osafdtmd[8860]: Started
>>> Jul 12 22:35:26 WR20-64_32 /etc/redhat-lsb/lsb_start_daemon: osafdtmd startup - OK
>>> Jul 12 22:35:26 WR20-64_32 /etc/redhat-lsb/lsb_log_message: - OK
>>> Jul 12 22:35:26 WR20-64_32 osafrded[8878]: Started
>>>
>>> Jul 12 22:35:26 WR20-64_32 osafrded[8878]: Started
>>> Jul 12 22:35:26 WR20-64_32 /etc/redhat-lsb/lsb_start_daemon: osafrded startup - OK
>>> Jul 12 22:35:26 WR20-64_32 /etc/redhat-lsb/lsb_log_message: - OK
>>> Jul 12 22:35:26 WR20-64_32 osafrded[8878]: rde@5030f has active state => Standby role
>>> Jul 12 22:35:26 WR20-64_32 osaffmd[8897]: Started
>>> Jul 12 22:35:26 WR20-64_32 osaffmd[8897]: EE_ID : safEE=Linux_os_hosting_clm_node,safHE=4500_slot_14,safDomain=domain_1
>>> Jul 12 22:35:26 WR20-64_32 /etc/redhat-lsb/lsb_start_daemon: osaffmd startup - OK
>>> Jul 12 22:35:26 WR20-64_32 /etc/redhat-lsb/lsb_log_message: - OK
>>> Jul 12 22:35:26 WR20-64_32 osafimmd[8913]: Started
>>> Jul 12 22:35:26 WR20-64_32 osafimmd[8913]: Initialization Success, role STANDBY
>>> Jul 12 22:35:26 WR20-64_32 /etc/redhat-lsb/lsb_start_daemon: osafimmd startup - OK
>>> Jul 12 22:35:26 WR20-64_32 /etc/redhat-lsb/lsb_log_message: - OK
>>> Jul 12 22:35:26 WR20-64_32 osafimmnd[8930]: Started
>>> Jul 12 22:35:26 WR20-64_32 osafimmnd[8930]: Initialization Success
>>> Jul 12 22:35:26 WR20-64_32 osafimmnd[8930]: Director Service is up
>>> Jul 12 22:35:26 WR20-64_32 /etc/redhat-lsb/lsb_start_daemon: osafimmnd startup - OK
>>> Jul 12 22:35:26 WR20-64_32 /etc/redhat-lsb/lsb_log_message: - OK
>>> Jul 12 22:35:26 WR20-64_32 osafimmnd[8930]: SERVER STATE: IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
>>> Jul 12 22:35:26 WR20-64_32 osafimmnd[8930]: SERVER STATE: IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
>>> Jul 12 22:35:26 WR20-64_32 osafimmnd[8930]: REQUESTING SYNC
>>> Jul 12 22:35:26 WR20-64_32 osafimmnd[8930]: SERVER STATE: IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
>>> Jul 12 22:35:26 WR20-64_32 osafimmnd[8930]: NODE STATE-> IMM_NODE_ISOLATED
>>> Jul 12 22:35:46 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 20 seconds
>>> Jul 12 22:36:06 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 40 seconds
>>> Jul 12 22:36:26 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 60 seconds
>>> Jul 12 22:36:46 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 80 seconds
>>> Jul 12 22:37:06 WR20-64_32 osafimmnd[8930]: REQUESTING SYNC AGAIN 1000
>>> Jul 12 22:37:06 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 100 seconds
>>> Jul 12 22:37:06 WR20-64_32 osafimmnd[8930]: Redundant sync request, when IMM_NODE_ISOLATED
>>> Jul 12 22:37:16 WR20-64_32 osafdtmd[8860]: DTM:dtm_comm_socket_recv() failed rc : 22
>>> Jul 12 22:37:26 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 120 seconds
>>> Jul 12 22:37:46 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 140 seconds
>>> Jul 12 22:37:52 WR20-64_32 osafimmd[8913]: IMMND DOWN on active controller f3 detected at standby immd!! f2. Possible failover
>>> Jul 12 22:37:52 WR20-64_32 osafimmd[8913]: Resend of fevs message 1855, will not mbcp to peer IMMD
>>> Jul 12 22:37:52 WR20-64_32 osafimmd[8913]: Message count:1856 + 1 != 1856
>>> Jul 12 22:38:06 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 160 seconds
>>> Jul 12 22:38:26 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 180 seconds
>>> Jul 12 22:38:46 WR20-64_32 osafimmnd[8930]: REQUESTING SYNC AGAIN 2000
>>> Jul 12 22:38:46 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 200 seconds
>>> Jul 12 22:38:46 WR20-64_32 osafimmnd[8930]: Redundant sync request, when IMM_NODE_ISOLATED
>>> Jul 12 22:38:53 WR20-64_32 osafdtmd[8860]: DTM: add New incoming connection to fd : 22
>>> Jul 12 22:39:06 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 220 seconds
>>> Jul 12 22:39:26 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 240 seconds
>>> Jul 12 22:39:46 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 260 seconds
>>> Jul 12 22:40:06 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 280 seconds
>>> Jul 12 22:40:26 WR20-64_32 osafimmnd[8930]: REQUESTING SYNC AGAIN 3000
>>> Jul 12 22:40:26 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 300 seconds
>>> Jul 12 22:40:26 WR20-64_32 osafimmnd[8930]: Redundant sync request, when IMM_NODE_ISOLATED
>>> ------------------------------------------------------
>>>
>>> But if I reverse the order, i.e. start sc-2 and then sc-1, both controllers start successfully.
>>>
>>> Could anyone tell me what's wrong?
>>>
>>> Thanks.
>>>
>>> Ted
>>>
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> This message (including any attachments) is for the named
>>> addressee(s)'s use only. It may contain sensitive, confidential,
>>> private proprietary or legally privileged information intended for a
>>> specific individual and purpose, and is protected by law. If you are
>>> not the intended recipient, please immediately delete it and all
>>> copies of it from your system, destroy any hard copies of it and
>>> notify the sender. Any use, disclosure, copying, or distribution of
>>> this message and/or any attachments is strictly prohibited.
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> _______________________________________________
>>> Opensaf-users mailing list
>>> Ope...@li...
>>> https://lists.sourceforge.net/lists/listinfo/opensaf-users
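
Note for readers of this thread: the sc-2 syslog quoted above can be summarized mechanically, which makes it easier to see that the IMMND reaches IMM_SERVER_SYNC_PENDING and then keeps waiting. The following is only a minimal Python sketch. The regular expressions are derived solely from the osafimmnd messages quoted in this thread, so other OpenSAF versions may word these messages differently, and the function name `summarize_immnd_sync` is a hypothetical helper, not part of OpenSAF.

```python
import re

# Patterns are based only on the osafimmnd lines quoted in this thread
# (assumption: other OpenSAF releases may use different wording).
STATE_RE = re.compile(r"osafimmnd\[\d+\]: SERVER STATE: (\S+) --> (\S+)")
WAIT_RE = re.compile(
    r"osafimmnd\[\d+\]: This node still waiting to be sync'ed after (\d+) seconds"
)
RETRY_RE = re.compile(r"osafimmnd\[\d+\]: REQUESTING SYNC AGAIN")


def summarize_immnd_sync(lines):
    """Return (state transitions, longest reported wait in seconds, retry count)."""
    transitions = []
    longest_wait = 0
    retries = 0
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            transitions.append((match.group(1), match.group(2)))
            continue
        match = WAIT_RE.search(line)
        if match:
            longest_wait = max(longest_wait, int(match.group(1)))
            continue
        if RETRY_RE.search(line):
            retries += 1
    return transitions, longest_wait, retries


# A few lines copied from the sc-2 log in this thread.
SAMPLE = """\
Jul 12 22:35:26 WR20-64_32 osafimmnd[8930]: REQUESTING SYNC
Jul 12 22:35:26 WR20-64_32 osafimmnd[8930]: SERVER STATE: IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
Jul 12 22:35:46 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 20 seconds
Jul 12 22:37:06 WR20-64_32 osafimmnd[8930]: REQUESTING SYNC AGAIN 1000
Jul 12 22:37:06 WR20-64_32 osafimmnd[8930]: This node still waiting to be sync'ed after 100 seconds
""".splitlines()

if __name__ == "__main__":
    transitions, longest_wait, retries = summarize_immnd_sync(SAMPLE)
    for old, new in transitions:
        print(f"{old} -> {new}")
    print(f"longest wait: {longest_wait} s, sync retries: {retries}")
```

To run it against a real log, pass the lines of the syslog file instead of SAMPLE. A long wait combined with repeated retries, as in the log above, suggests checking on the other controller whether a sync was ever started.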