Message archive (posts per month):

Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec
---|---|---|---|---|---|---|---|---|---|---|---|---
2009 | | | | | | | | | | | | 4
2010 | 20 | 11 | 11 | 9 | 22 | 85 | 94 | 80 | 72 | 64 | 69 | 89
2011 | 72 | 109 | 116 | 117 | 117 | 102 | 91 | 72 | 51 | 41 | 55 | 74
2012 | 45 | 77 | 99 | 113 | 132 | 75 | 70 | 58 | 58 | 37 | 51 | 15
2013 | 28 | 16 | 25 | 38 | 23 | 39 | 42 | 19 | 41 | 31 | 18 | 18
2014 | 17 | 19 | 39 | 16 | 10 | 13 | 17 | 13 | 8 | 53 | 23 | 7
2015 | 35 | 13 | 14 | 56 | 8 | 18 | 26 | 33 | 40 | 37 | 24 | 20
2016 | 38 | 20 | 25 | 14 | 6 | 36 | 27 | 19 | 36 | 24 | 15 | 16
2017 | 8 | 13 | 17 | 20 | 28 | 10 | 20 | 3 | 18 | 8 | | 5
2018 | 15 | 9 | 12 | 7 | 123 | 41 | | 14 | | 15 | | 7
2019 | 2 | 9 | 2 | 9 | | | 2 | | 6 | 1 | 12 | 2
2020 | 2 | | | 3 | | 4 | 4 | 1 | 18 | 2 | |
2021 | | 3 | | | | | 6 | | 5 | 5 | 3 |
2022 | | | 3 | | | | | | | | |
From: Aleksander W. <ale...@mo...> - 2015-06-10 20:31:20
|
Hi,

What is the version of your MooseFS installation?

Best regards,
Aleksander Wieliczko
Technical Support Engineer

-----Original message-----
From: "Ben Harker" <bj...@ba...>
Sent: 2015-06-10 22:15
To: "moo...@li..." <moo...@li...>
Subject: [MooseFS-Users] Disappearing chunks on Ubuntu 14.04 chunk server (ZFS)
From: Ben H. <bj...@ba...> - 2015-06-10 20:12:15
|
Hello all,

I'm afraid I'm out of the office, so I'm unable to post any logs, but I did notice some oddness today: after a reboot, one of my chunkservers (one of a pair) was not listing any of its chunks in moosefs-cgi, though the process was definitely running and the files were definitely on the drive (luckily, this is a backup of non-critical data!).

Both chunkservers run 1 TB ZFS volumes on Ubuntu Server VMs, and all mfs utilities are current as of today. Most of the data had a goal of 2, but some had a goal of 1, and that's the data I'm after. I've grepped and tailed through various logs but so far haven't come up with anything indicating an issue. Can anyone think of any simple things I can check? (If not, is there a way I can get the chunks back into MFS somehow? I'm still learning the ropes here...)

Many thanks in advance for any help.

Ben.
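A few generic things that might be worth checking in a case like this — this is only a sketch, and the /etc/mfs/mfshdd.cfg path and the /mnt/zfs-chunks mount point are assumptions to adapt to the actual setup:

    # which directories is the chunkserver configured to scan?
    cat /etc/mfs/mfshdd.cfg

    # are the chunk files actually there? MooseFS keeps them as
    # chunk_<id>_<version>.mfs inside 00..FF subfolders under each hdd path
    find /mnt/zfs-chunks -name 'chunk_*.mfs' | wc -l

    # did the chunkserver report any disk/scan problems at start-up?
    grep mfschunkserver /var/log/syslog | tail -n 50

If mfshdd.cfg points at a directory that the ZFS volume was not yet mounted on at boot time, the chunkserver could register with an essentially empty folder, which would match the symptoms described above.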
From: Krzysztof K. <krz...@mo...> - 2015-06-10 09:33:15
|
Dear Eugene,

In the MooseFS 2.0.68 source you'll find that the message "connect failed, error: ETIMEDOUT" is logged by the mainserv_connect() function in mfschunkserver when a TCP connection times out:

    int mainserv_connect(uint32_t fwdip,uint16_t fwdport,uint32_t timeout) {
        int fwdsock;
        fwdsock = tcpsocket();
        if (fwdsock<0) {
            mfs_errlog(LOG_WARNING,"create socket, error");
            return -1;
        }
        if (tcpnonblock(fwdsock)<0) {
            mfs_errlog(LOG_WARNING,"set nonblock, error");
            tcpclose(fwdsock);
            return -1;
        }
        if (tcpnumtoconnect(fwdsock,fwdip,fwdport,timeout)<0) {
            mfs_errlog(LOG_WARNING,"connect failed, error");
            tcpclose(fwdsock);
            return -1;
        }
        if (tcpnodelay(fwdsock)<0) {
            mfs_errlog(LOG_WARNING,"can't set TCP_NODELAY, error");
        }
        return fwdsock;
    }

mainserv_connect() is called in the following loop within mfschunkserver's mainserv_write() function, which retries the connection if it was not successful (i.e. resulted in ETIMEDOUT):

    for (i=0 ; i<CONNECT_RETRIES && fwdsock<0 ; i++) {
    #ifdef USE_CONNCACHE
        if (i==0) {
            fwdsock = conncache_get(fwdip,fwdport);
        }
        if (fwdsock<0) {
            fwdsock = mainserv_connect(fwdip,fwdport,CONNECT_TIMEOUT(i));
        }
    #else
        fwdsock = mainserv_connect(fwdip,fwdport,CONNECT_TIMEOUT(i));
    #endif
        if (fwdsock>=0) {
            packet = mainserv_create_packet(&wptr,CLTOCS_WRITE,length-6);
            if (protover) {
                put8bit(&wptr,protover);
            }
            put64bit(&wptr,gchunkid);
            put32bit(&wptr,gversion);
            if (protover) {
                memcpy(wptr,data,length-13-6);
            } else {
                memcpy(wptr,data,length-12-6);
            }
            if (mainserv_send_and_free(fwdsock,packet,length-6)) {
                break;
            }
            tcpclose(fwdsock);
            fwdsock=-1;
        }
    }

As you can see, each time the connection is retried (by design the system retries 10 times) the timeout value is adjusted according to the following formulas:

    #define CONNECT_RETRIES 10
    #define CONNECT_TIMEOUT(cnt) (((cnt)%2)?(300*(1<<((cnt)>>1))):(200*(1<<((cnt)>>1))))

Using the simple program below you can see what the timeout values are for each iteration:

    #include <stdio.h>

    #define CONNECT_RETRIES 10
    #define CONNECT_TIMEOUT(cnt) (((cnt)%2)?(300*(1<<((cnt)>>1))):(200*(1<<((cnt)>>1))))

    int main() {
        for(int i=0; i<CONNECT_RETRIES; i++) {
            printf("%d, ", CONNECT_TIMEOUT(i));
        }
        printf("\n");
        return 0;
    }

    $ gcc -o retry retry.c
    $ ./retry
    200, 300, 400, 600, 800, 1200, 1600, 2400, 3200, 4800,

So in your case it might be some network issue (network congestion?) that resulted in a few retries on TCP connect. It's also important to note that although there were some timeouts, the request was processed successfully thanks to the retry capability; the message is a WARNING, not an ERROR, which is why you will not see any issues in the changelog. Should the connection fail after the predefined number of retries (default 10), the client would get the packet back with the information that there was an error during I/O transaction processing.

The message "got unknown message (type:212)" was the result of a small bug in I/O processing when a timeout occurs, and it has been fixed in the release that we will publish today/tomorrow.

Best Regards,
Krzysztof Kielak
Director of Operations and Customer Support
Mobile: +48 601 476 440

> On 10 Jun 2015, at 07:13, Eugene Diatlov <it...@da...> wrote:
>
> Hi,
>
> How to debug such messages?
>
> running 2.0.68 version.
From: Eugene D. <it...@da...> - 2015-06-10 05:30:43
|
Hi,

How to debug such messages? Running version 2.0.68.

tail -f /var/log/daemon.log
Jun 10 01:05:29 node1 mfschunkserver[1603]: connect failed, error: ETIMEDOUT (Operation timed out)
Jun 10 01:05:29 node1 mfschunkserver[1603]: connect failed, error: ETIMEDOUT (Operation timed out)
Jun 10 01:05:29 node1 mfschunkserver[1603]: connect failed, error: ETIMEDOUT (Operation timed out)
Jun 10 01:05:30 node1 mfschunkserver[1603]: connect failed, error: ETIMEDOUT (Operation timed out)
Jun 10 01:05:30 node1 mfschunkserver[1603]: got unknown message (type:212)
Jun 10 01:05:30 node1 mfschunkserver[1603]: connect failed, error: ETIMEDOUT (Operation timed out)
Jun 10 01:05:30 node1 mfschunkserver[1603]: connect failed, error: ETIMEDOUT (Operation timed out)
Jun 10 01:05:31 node1 mfschunkserver[1603]: connect failed, error: ETIMEDOUT (Operation timed out)
Jun 10 01:05:31 node1 mfschunkserver[1603]: connect failed, error: ETIMEDOUT (Operation timed out)
Jun 10 01:05:32 node1 mfschunkserver[1603]: connect failed, error: ETIMEDOUT (Operation timed out)
Jun 10 01:05:32 node1 mfschunkserver[1603]: got unknown message (type:212)

At the same time, other logs (and the system as a whole) look stable.

tail -f ./changelog_ml.0.mfs
55486822: 1433913127|CREATE(34818,77615220398_1_20150610015234_00000128.mp4,1,420,18,0,0,0):3570
55486823: 1433913127|ACQUIRE(321087,3570)
55486824: 1433913127|CREATE(17649,77619977594_1_20150610030115_00000085.mp4,1,420,18,0,0,0):3574
55486825: 1433913127|ACQUIRE(321090,3574)
55486826: 1433913127|WRITE(3570,0,1):7076220
55486827: 1433913127|CREATE(4511,77618939592_1_20150609125244_00000612.mp4,1,420,18,0,0,0):3575
55486828: 1433913127|ACQUIRE(321089,3575)
55486829: 1433913127|CREATE(172741,77612131058_1_20150610040406_00000043.mp4,1,420,18,0,0,0):3576
55486830: 1433913127|ACQUIRE(321088,3576)
55486831: 1433913127|WRITE(3574,0,1):7076221
55486832: 1433913127|WRITE(3576,0,1):7076222
55486833: 1433913127|WRITE(3575,0,1):7076223
55486834: 1433913127|LENGTH(3574,4187102)
55486835: 1433913127|UNLOCK(7076221)

--
Best regards,
Head of R&D department
Eugene Diatlov
http://datalink.ua
From: Aleksander W. <ale...@mo...> - 2015-06-09 12:32:12
|
Hi,

We would like to explain that we unified the names of the installation packages and init scripts starting from version 2.0.60. The name change was announced on our website:

https://moosefs.com/download.html

Our documentation also contains information about the new sysv and systemd naming:

https://moosefs.com/Content/Downloads/MooseFS-2-0-60-User-Manual.pdf

So during installation of, or an update to, version 2.0.6x and above, the system will not carry over the old init script state, because from its point of view it is not the same package.

Best regards,
Aleksander Wieliczko
Technical Support Engineer
MooseFS.com

On 08.06.2015 08:04, Tom Ivar Helbekkmo wrote:
> It turns out that the init scripts have changed names, and the upgrade
> does not duplicate the state of the old ones onto the new. A manual
> 'chkconfig --add' and 'chkconfig on' is needed for each of them.
From: Aleksander W. <ale...@mo...> - 2015-06-08 07:42:51
|
Hi,

The GPG key has been updated. It's available at:

http://ppa.moosefs.com/moosefs.key

You can also find the updated MooseFS GPG key on key servers such as:

http://keyserver.ubuntu.com
https://pgp.mit.edu/

Best regards,
Aleksander Wieliczko
Technical Support Engineer
MooseFS.com

On 08.06.2015 03:55, Neddy, NH. Nam wrote:
> Hi, MooseFS's GPG key is expired:
>
> pub 2048R/CF82ADBA 2014-06-04 [expired: 2015-06-04]
> uid MooseFS Development Team (Official MooseFS Repositories) <su...@mo...>
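For Debian/Ubuntu machines that use the ppa.moosefs.com repository, refreshing the key could look roughly like this (just a sketch; newer releases may prefer dropping the key into /etc/apt/trusted.gpg.d instead of using apt-key):

    # fetch the updated key and add it to apt's keyring, then refresh the package lists
    wget -O - http://ppa.moosefs.com/moosefs.key | sudo apt-key add -
    sudo apt-get update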
From: Aleksander W. <ale...@mo...> - 2015-06-08 07:14:29
|
Hi,

I would like to inform you that at the moment the MooseFS High Availability solution is available only in the PRO version. If you are interested in testing the MooseFS PRO version, please write directly to su...@mo...

Best regards,
Aleksander Wieliczko
Technical Support Engineer
MooseFS.com

On 01.06.2015 08:14, zh...@gm... wrote:
> hi,i want to know does the CE version support this
> "if a leader master stops working,a follower master is immediately
> ready to take on the role of leader"
From: Tom I. H. <ti...@ha...> - 2015-06-08 06:04:30
|
We had a bit of a surprise the other day. We've been running MooseFS Pro version 2.0.43 on RHEL 6, and upgraded to 2.0.62. Everything went smoothly, of course -- until there was a restart of one of the servers, and the mfs services didn't start at boot.

It turns out that the init scripts have changed names, and the upgrade does not duplicate the state of the old ones onto the new. A manual 'chkconfig --add' and 'chkconfig on' is needed for each of them.

-tih
--
Popularity is the hallmark of mediocrity. --Niles Crane, "Frasier"
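A sketch of that manual step under SysV init on RHEL 6 — the service names below (moosefs-master, moosefs-chunkserver, moosefs-metalogger, moosefs-cgiserv) are assumptions based on the renamed 2.0.60+ packages, so check what actually landed in /etc/init.d first:

    # see which init scripts the new packages installed
    ls /etc/init.d | grep -i moosefs

    # register and enable each one, e.g. for the master:
    chkconfig --add moosefs-master
    chkconfig moosefs-master on

    # confirm the runlevel configuration
    chkconfig --list | grep -i moosefs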
From: Neddy, N. N. <na...@nd...> - 2015-06-08 02:25:07
|
Hi,

MooseFS's GPG key has expired:

pub 2048R/CF82ADBA 2014-06-04 [expired: 2015-06-04]
uid MooseFS Development Team (Official MooseFS Repositories) <su...@mo...>
From: <zh...@gm...> - 2015-06-01 06:14:11
|
Hi,

I want to know whether the CE version supports this: "if a leader master stops working, a follower master is immediately ready to take on the role of leader".

Glad to get your reply.

PS: I used MooseFS at version 1.6.x, and at that time I used other tools such as LVS and DRBD to handle failover, but I didn't like that approach, so in the end I stopped using MooseFS. Now I see there is a new version that seems to have solved the single-master-server problem, and I am glad to follow it again.

zh...@gm...
From: Piotr R. K. <pio...@mo...> - 2015-05-29 09:48:33
|
Hello,

Please download and install MooseFS 2.0.68 from packages or sources: http://get.moosefs.com

Version 1.6.x has a lot of bugs (which have been fixed in 2.0.x) and is not supported any more.

Best regards,
Piotr Robert Konopelko
MooseFS Technical Support Engineer | moosefs.com
From: 九日 <209...@qq...> - 2015-05-29 04:30:54
|
I installed mfs-1.6.27-5 on CentOS 5.11. It has seven chunkservers with a total disk size of 56 TB, and the MFS filesystem holds 72229209 files.

Today the mfs client's /var/log/messages shows "no chunkservers" and the MFS filesystem hangs. The mfsmaster server's /var/log/messages continually prints:

chunk-server alrerady connected
connection with CS(x.x.x.x) has been closed by peer

and the mfsmaster OS has a lot of MFS-related TCP connections.

Help!
From: Tom I. H. <ti...@ha...> - 2015-05-18 10:40:16
|
Aleksander Wieliczko <ale...@mo...> writes:

> Also your kernel part of fuse has to support this mechanism. We mainly
> tested in on Linux with kernel 3.13 (ubuntu 14.04).
> As I remember there are no support for locking in other operating systems.

I just ran some tests, using this tool:

ftp://ftp.software.ibm.com/software/nfs/backup/beta/testlock.c

Both flock and lockf style locking work exactly as they should between two clients running MooseFS 3.0.22 on NetBSD-current and Ubuntu 15.04, including correctly handling shared and exclusive locking with flock. Nice! :)

-tih
--
Popularity is the hallmark of mediocrity. --Niles Crane, "Frasier"
From: Aleksander W. <ale...@mo...> - 2015-05-18 04:36:28
|
Hello!

Global locking support has been added in MooseFS 3.0. There are two locking mechanisms supported by this version: "posix_locks" (ioctl, lockf) and "bsd_locks" (flock).

Global locking also needs to be supported by your version of the fuse library and kernel: "posix_locks" were introduced in fuse version 2.6 and "bsd_locks" in fuse version 2.9, and the kernel part of fuse also has to support the mechanism. We mainly tested it on Linux with kernel 3.13 (Ubuntu 14.04). As far as I remember, there is no support for locking in other operating systems.

Be aware that there was a bug in the "flock" locking mechanism in MooseFS up to version 3.0.20, so if you want to use "flock", make sure you have the latest MooseFS version (at least 3.0.21).

Best regards,
Aleksander Wieliczko
Technical Support Engineer
MooseFS.com

On 05/15/2015 11:21 AM, Alexey Tsivinsky wrote:
> Good day!
>
> Please advise how I can enable global file locking on a MooseFS 2.0.60
> setup, and how I can check it?
>
> Thanks.
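One quick way to verify the BSD-style (flock) behaviour between two mounted clients is the flock(1) utility from util-linux; this is only a sketch and assumes the filesystem is mounted at /mnt/mfs on both machines:

    # on client A: create a lock file, take an exclusive lock and hold it for a minute
    touch /mnt/mfs/locktest
    flock -x /mnt/mfs/locktest -c 'sleep 60'

    # on client B, while A still holds the lock: a non-blocking attempt should fail
    flock -n /mnt/mfs/locktest -c 'echo acquired' || echo 'lock is held by another client'

If the second command prints 'acquired' immediately, the lock is not being propagated between the clients.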
From: Rashit A. <ras...@ya...> - 2015-05-16 11:13:36
|
Great! Thank you for quick response!

15.05.2015, 15:11, "Aleksander Wieliczko" <ale...@mo...>:
> Hi Rashit!
> Thank you for this informations.
> This helped us to remove hidden dircache invalidation bug.
>
> Right now new version of MooseFS with fix is available on our website.
>
> 2.0.68 - stable version
> 3.0.22 - current version
From: Aleksander W. <ale...@mo...> - 2015-05-15 12:11:32
|
Hi Rashit!

Thank you for this information. It helped us to find and remove a hidden dircache invalidation bug.

A new version of MooseFS with the fix is now available on our website:

2.0.68 - stable version
3.0.22 - current version

Best regards,
Aleksander Wieliczko
Technical Support Engineer
MooseFS.com

On 05/14/2015 01:48 PM, Rashit Azizbaev wrote:
> Hello!
>
> We're using 2.0.60 version of Moosefs and in some rare cases have problem opening file
> for writing stored on Moosefs file system. This happens just after the deletion of this
> file and I believe this error is somehow connected with dircache.
From: Alexey T. <Ale...@ba...> - 2015-05-15 09:38:28
|
Good day!

Please advise how I can enable global file locking on a MooseFS 2.0.60 setup, and how I can check it?

Thanks.

--
Best Regards!
Alexey Tsivinsky
e-mail: it...@ba...
From: Rashit A. <ras...@ya...> - 2015-05-14 12:06:00
|
Hello!

We're using the 2.0.60 version of MooseFS and in some rare cases have a problem opening a file for writing on the MooseFS file system. This happens just after the deletion of that file, and I believe this error is somehow connected with dircache. We have the following code:

    void File::Create(const char *Name) throw (std::exception)
    {
        c_name = Name;

        /* delete the file */
        HSDelete(c_name);

        /* create the file */
        ff = fopen(c_name, "w");

        if (ff == NULL) {
            int errnum = errno;
            throw FR::Exception(errnum, "File::Create(%s)\ncannot CREATE file !!!", c_name.txt());
        }
    }

    int HSDelete(const char *name)
    {
        int ret;

        char * path = strdup(name);
        path = dirname(path);
        DIR *dir = opendir(path);
        if (dir) {
            Diags::Debug("HSDelete: scanning directory %s", path);
            struct dirent *de = 0;
            do {
                de = readdir(dir);
            } while(de);
            closedir(dir);
        }
        else
            Diags::Error(errno, "HSDelete: Cannot open directory %s", path);

        Diags::Debug("HSDelete: deleting file %s", name);

        ret = unlink(name);

        if (ret != 0 && errno != ENOENT)
            Diags::Error(errno, "HSDelete: Cannot delete file %s", name);

        return ret;
    }

This code is executed on different files in parallel, and in about 10% of cases fopen returns a 'No such file' error.

This is how it is seen in oplog (for file data/aprelevskoe/Lines/bl7_1_I_217/DepthModels/TEST_FILE_SYS/depthMigration.mig):

05.06 19:07:22.289334: uid:1002 gid:1000 pid:15476 cmd:unlink (3046656,depthMigration.mig): OK
05.06 19:07:22.291324: uid:1002 gid:1000 pid:15476 cmd:lookup (1,data): OK (0.0,110,1.0,[drwxrwx---:0040770,69,1002,1000,1430927232,1430843190,1430843190,4041897])
05.06 19:07:22.293070: uid:1002 gid:1000 pid:15476 cmd:lookup (110,aprelevskoe): OK (0.0,4173806,1.0,[drwxrwxr-x:0040775,12,1001,1000,1430927234,1429271510,1429271510,4000266])
05.06 19:07:22.294813: uid:1002 gid:1000 pid:15476 cmd:lookup (4173806,Lines): OK (0.0,4173818,1.0,[drwxrwxr-x:0040775,1566,1001,1000,1430837413,1428610404,1428610404,3011673])
05.06 19:07:22.296328: uid:1002 gid:1000 pid:15476 cmd:lookup (4173818,bl7_1_I_217): OK (0.0,4340334,1.0,[drwxrwxr-x:0040775,7,1001,1000,1429999627,1426953017,1426953017,2023511])
05.06 19:07:22.297866: uid:1002 gid:1000 pid:15476 cmd:lookup (4340334,DepthModels): OK (0.0,4340341,1.0,[drwxrwxr-x:0040775,5,1001,1000,1429999627,1430928238,1430928238,2016702])
05.06 19:07:22.299618: uid:1002 gid:1000 pid:15476 cmd:lookup (4340341,TEST_FILE_SYS): OK (0.0,3046656,1.0,[drwxrwxr-x:0040775,3,1002,1000,1430928442,1430928442,1430928442,2000210])
05.06 19:07:22.299721: uid:1002 gid:1000 pid:15476 cmd:lookup (3046656,depthMigration.mig) (using open dir cache): OK (0.0,3088005,1.0,[-rw-rw-r--:0100664,1,1002,1000,1430928276,1430928303,1430928303,2207520])
05.06 19:07:22.301303: uid:1002 gid:1000 pid:15476 cmd:open (3088005): ENOENT (No such file or directory)

It looks like unlink doesn't invalidate the dircache entry, and lookup returns a cached result which no longer exists. We recompiled the MooseFS client binary with dircache disabled and that eliminates the problem, but it has some performance costs.

We haven't yet tested the latest 2.0.67 version; do you believe this bug could be fixed in it?
From: Eduardo K. <edu...@do...> - 2015-04-30 17:38:56
|
Dear Krzysztof Kielak,

I do not have performance problems; I only noticed the change and it caught my attention.

Best Regards,
Eduardo

On Tuesday, 7 April 2015 at 23:00:36, Krzysztof Kielak wrote:
> Do you see any performance problems with deletions on your MooseFS 2.0.x system or you are just concerned that the behaviour changed after the upgrade?

--
Eduardo Kellenberger
Departamento de Infraestructura Tecnológica
DonWeb
"La actitud es todo"
Donweb.com
From: web u. <web...@gm...> - 2015-04-23 09:47:23
|
The problem was solved by setting up NAT rules and opening the firewall ports. Thanks for your help!

On Thu, Apr 23, 2015 at 5:45 AM, Aleksander Wieliczko <ale...@mo...> wrote:
> Hi
> Are you using DNS name or ip address for MooseFS client mount?
>
> MooseFS client use this ports: 9422 for chunkservers communication and
> 9421 for master communication.
From: Aleksander W. <ale...@mo...> - 2015-04-23 09:46:08
|
Hi,

Are you using a DNS name or an IP address for the MooseFS client mount?

The MooseFS client uses these ports: 9422 for chunkserver communication and 9421 for master communication. Please check whether you can telnet to all chunkserver and master machines once the VPN connection is up.

When we make changes to the mfsexports.cfg file, we need to send a reload command or a HUP signal to mfsmaster:

# mfsmaster reload
or
# kill -HUP [mfsmaster PID]
or
# kill -1 [mfsmaster PID]

Best regards,
Aleksander Wieliczko
Technical Support Engineer
MooseFS.com

On 04/23/2015 01:19 AM, web user wrote:
> So my setup is probably a little more complicated then most. I can mount
> the mfs fine inside my network. I'm trying to do dev work from home and
> have setup a vpn.
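A rough way to run those checks from the VPN client before attempting the mount — only a sketch, with master.example and chunk1.example standing in for the real hostnames:

    # are the MooseFS ports reachable through the VPN/firewall?
    nc -zv master.example 9421
    nc -zv chunk1.example 9422

    # mount with an explicit master host instead of relying on the default 'mfsmaster' name
    mfsmount /mnt/mfs -H master.example

Since mfsmount talks directly to the master on 9421 and to every chunkserver on 9422, it does not go over ssh; both ports have to be reachable from the client.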
From: web u. <web...@gm...> - 2015-04-22 23:19:43
|
So my setup is probably a little more complicated than most. I can mount the MFS share fine inside my network. I'm trying to do dev work from home and have set up a VPN. I can ssh into the server and look up web sites running on the internal network without problems.

I'm also running a VMware Player session. I can't mount the MFS folder inside the Linux session. When I make changes to mfsexports, do I need to restart anything?

Any idea how I can debug what is going on here?

What port does mfsmount use? Does it just go over ssh?

Regards,

WU
From: Ricardo J. B. <ric...@do...> - 2015-04-20 18:34:31
|
I believe the MFS info (the CGI) shows you total space, not usable space, so in your case it would show ~30 TB.

I have 2 small clusters, 2 chunkservers per cluster, 4 x 1 TB disks per chunkserver, and MFS shows me 7.2 TB total for each cluster. I have goal=2, so I know I have ~3.5 TB usable, but since you can mix (i.e. have some dirs with a different goal if you have more than 2 chunkservers) I guess it's hard to estimate usable disk space from the CGI.

Cheers,

On Monday, 20/04/2015, Neddy, NH. Nam wrote:
> Hi Steve,
>
> I've thought the same, but if the numbers of HDD is odd, my
> calculation is different than MFS info. It's confuse me.

--
Ricardo J. Barberis
Senior SysAdmin / IT Architect
DonWeb
La Actitud Es Todo
www.DonWeb.com
From: Neddy, N. N. <na...@nd...> - 2015-04-20 15:59:32
|
Hi Steve,

I've thought the same, but if the number of HDDs is odd, my calculation differs from the MFS info. It confuses me.

Thanks.

On Mon, Apr 20, 2015 at 10:48 PM, Steve Wilson <st...@pu...> wrote:
> Hi,
>
> I would think that should be:
> 4 chunkservers * 2 disks * 4TB / 3 copies = 10.7
> Minus, of course, some percentage for file system overhead, etc.
>
> Steve
From: Steve W. <st...@pu...> - 2015-04-20 15:49:06
|
Hi,

I would think that should be:

4 chunkservers * 2 disks * 4 TB / 3 copies = 10.7

Minus, of course, some percentage for file system overhead, etc.

Steve

On 04/20/2015 11:43 AM, Neddy, NH. Nam wrote:
> Hi,
>
> I'm not a smart man, can somebody help me to find out equation to
> calculate Total Space of Moosefs? For example: 4 chunkservers which
> have 2x4TB HDDs each, and goal = 3?
>
> Thanks,
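As a rough rule of thumb, usable space is raw space divided by the goal. A quick sanity check of the figure above (a sketch that ignores filesystem overhead and mixed goals):

    # 4 chunkservers x 2 disks x 4 TB raw, goal = 3
    echo 'scale=1; (4 * 2 * 4) / 3' | bc
    # prints 10.6 -- roughly the 10.7 TB figure above, while the CGI "total space"
    # column reports the raw ~32 TB before overhead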