From: Jakub Kruszona-Z. <jak...@ge...> - 2018-06-04 06:55:49
> On 4 Jun, 2018, at 8:30, Gandalf Corvotempesta <gan...@gm...> wrote:
>
> On Mon, 4 Jun 2018, 07:31 Jakub Kruszona-Zawadzki <jak...@ge...> wrote:
>> It is ok. You should have one ELECT after LEADER death. ELECT should become LEADER quickly after that.
>> If it doesn't work then it looks like that you have some configuration problems (number of working chunkservers vs number of known chunkservers).
>
> I have a 3 servers cluster (it's a test but the hardware is what i'll put in production)

It's ok.

> In these 3 servers i'm running master and chunkservers, thus, i have 3 chunkservers

ok.

> I've not checked for any disconnected chunkserver but it should block after 2 disconnected chunkservers, right?

Yes (if the total number of known chunkservers is 3 or 4).

> In a 3 nodes cluster, quorum is met at 2, so it should survive at 1 chunkserver failure and i'm pretty sure that i don't have 2 chunkservers down during the master switch

Yes. I've seen in your log three chunkservers connected to ELECT - this is really strange. Could you please send us some screenshots from your CGI? As I understand it, you have three masters and three chunkservers on the same machines, and ELECT is not becoming LEADER for a long time (minutes) after killing LEADER, but when you stop everything and start again then you have a LEADER?

> Anyway, is an odd number of metadata servers/chunkservers suggested?

No. Maybe for a small number of chunkservers it is better to have an odd number, but only because it is more efficient (2N and 2N-1 servers give the same safety level in terms of the number of chunkservers that may die). In your case (3 servers) you may of course add one more, but still only one can die without stopping the cluster.

> Because on an even number, splitbrain could arise and quorum can't always be met. (4 nodes, 2 down. Splitbrain and no quorum)

No splitbrain, because MORE than half is needed for quorum, so in the case of 4 we need 3 for quorum.
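
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the rule as stated here (strictly more than half of the known chunkservers must be working); the function names are made up for illustration and are not part of MooseFS.

```python
def quorum(known: int) -> int:
    """Smallest number of working servers that is strictly more than half.

    Assumption: 'known' is the total number of known chunkservers,
    as described in this thread. Hypothetical helper, not a MooseFS API.
    """
    if known < 1:
        raise ValueError("need at least one known server")
    return known // 2 + 1


def tolerated_failures(known: int) -> int:
    """How many servers may die while a quorum can still be formed."""
    return known - quorum(known)


if __name__ == "__main__":
    # 3 and 4 servers both tolerate one failure; 5 and 6 both tolerate two.
    for n in (3, 4, 5, 6):
        print(f"known={n} quorum={quorum(n)} may_die={tolerated_failures(n)}")
```

With 4 known servers and 2 down, only 2 remain, which is below the quorum of 3, so neither half can continue on its own.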

--
Regards,
Jakub Kruszona-Zawadzki
- - - - - - - - - - - - - - - -
Segmentation fault (core dumped)
Phone: +48 602 212 039