From: Aleksander W. <ale...@mo...> - 2015-12-09 08:49:29
Hi again,

Yes, you can change this parameter in /etc/mfs/mfschunkserver.cfg for testing:

    MASTER_TIMEOUT = 120

Can you also check your HDD performance, especially in the location where the metadata are saved? By default this is /var/lib/mfs.

Best regards
Aleksander Wieliczko
Technical Support Engineer
MooseFS.com <moosefs.com>

On 12/09/2015 09:07 AM, fangyao wrote:
> Hi.
> Dell R420
> CPU: Intel(R) Xeon(R) CPU E5-2420 0 @ 1.90GHz
> RAM: 48G
> mfsmaster used 19G; about 68714195 goal 2 files, total 25 TiB.
>
> This is a snapshot taken while the master dumps metadata:
>
>   PID USER  PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+ COMMAND
>  9058 mfs    6 -19 19.4g  19g 3096 R 100.0 41.2  0:06.51 mfsmaster
> 18477 mfs    0 -19 19.4g  19g 4016 S   2.6 41.2  3925:34 mfsmaster
>
> The system does not use swap because there is enough RAM.
>
> By the way, can the chunkserver disconnect timeout be configured?
>
> 2015-12-09
> ------------------------------------------------------------------------
> *方垚| (8610) 62368638-8906*
> ------------------------------------------------------------------------
> *From:* Aleksander Wieliczko
> *Sent:* 2015-12-09 15:31:43
> *To:* fangyao; moosefs-users
> *Cc:*
> *Subject:* Re: [MooseFS-Users] del and add
> Hi.
> Thank you for this information.
>
> The first problem we noticed is that your chunkservers were
> disconnected at Dec 9 08:01:36. This kind of behaviour may indicate
> that your mfsmaster process is working really hard.
> The second piece of information from your syslog is the metadata store
> time: about 100 seconds.
> Combining these two facts, we suspect that you may be experiencing a
> swapping problem.
> Every hour the master dumps metadata in a separate subprocess, so it
> needs more RAM during this operation (about 10-20% more than usual).
>
> So can you check the following on your mfsmaster server:
>
> - Amount of RAM installed in your hardware.
> - Amount of RAM used by the master.
>
> You can also run top at the full hour and check whether your system
> uses swap.
> It is possible that you need to increase the amount of RAM in your
> master server.
>
> Best regards
> Aleksander Wieliczko
> Technical Support Engineer
> MooseFS.com <moosefs.com>
> On 12/09/2015 04:39 AM, fangyao wrote:
>> Hi.
>> All servers run MooseFS version 2.0.80-1, and we edited editchunks.c
>> //
>> syslog(LOG_WARNING,"chunkserver has nonexistent chunk (%016"PRIX64"_%08"PRIX32"), so create it for future deletion",chunkid,version);
>>
>> syslog
>>
>> Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: main master server module: (ip:192.168.1.46) write error: EPIPE (Broken pipe)
>> Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: csdb: found cs using ip:port and csid (192.168.1.82:9422,5), but server is still connected
>> Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: can't accept chunkserver (ip: 192.168.1.82 / port: 9422)
>> Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver disconnected - ip: 192.168.1.82 / port: 9422, usedspace: 4585512808448 (4270.59 GiB), totalspace: 12073950502912 (11244.74 GiB)
>> Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver disconnected - ip: 192.168.1.80 / port: 9422, usedspace: 852051292160 (793.53 GiB), totalspace: 2219235217408 (2066.82 GiB)
>> Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver disconnected - ip: 192.168.1.83 / port: 9422, usedspace: 4585739321344 (4270.80 GiB), totalspace: 12073407995904 (11244.24 GiB)
>> Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver disconnected - ip: 192.168.1.79 / port: 9422, usedspace: 863847518208 (804.52 GiB), totalspace: 2249241448448 (2094.77 GiB)
>> Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver disconnected - ip: 192.168.1.81 / port: 9422, usedspace: 860428738560 (801.34 GiB), totalspace: 2240201351168 (2086.35 GiB)
>> Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver disconnected - ip: 192.168.1.84 / port: 9422, usedspace: 4586834935808 (4271.82 GiB), totalspace: 12073407995904 (11244.24 GiB)
>> Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver disconnected - ip: 192.168.1.82 / port: 9422, usedspace: 4585512808448 (4270.59 GiB), totalspace: 12073950502912 (11244.74 GiB)
>> Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver disconnected - ip: 192.168.1.78 / port: 9422, usedspace: 864117645312 (804.77 GiB), totalspace: 2249241448448 (2094.77 GiB)
>> Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: csdb: found cs using ip:port and csid (192.168.1.84:9422,7)
>> Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver register begin (packet version: 6) - ip: 192.168.1.84 / port: 9422, usedspace: 4586834935808 (4271.82 GiB), totalspace: 12073407995904 (11244.24 GiB)
>> Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: csdb: found cs using ip:port and csid (192.168.1.81:9422,4)
>> Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver register begin (packet version: 6) - ip: 192.168.1.81 / port: 9422, usedspace: 860428738560 (801.34 GiB), totalspace: 2240201351168 (2086.35 GiB)
>> Dec 9 08:01:37 mfsmaster1 mfsmaster[18477]: csdb: found cs using ip:port and csid (192.168.1.79:9422,2)
>> Dec 9 08:01:37 mfsmaster1 mfsmaster[18477]: chunkserver register begin (packet version: 6) - ip: 192.168.1.79 / port: 9422, usedspace: 863847518208 (804.52 GiB), totalspace: 2249241448448 (2094.77 GiB)
>> Dec 9 08:01:37 mfsmaster1 mfsmaster[18477]: csdb: found cs using ip:port and csid (192.168.1.80:9422,3)
>> Dec 9 08:01:37 mfsmaster1 mfsmaster[18477]: chunkserver register begin (packet version: 6) - ip: 192.168.1.80 / port: 9422, usedspace: 852052361216 (793.54 GiB), totalspace: 2219235217408 (2066.82 GiB)
>> Dec 9 08:01:38 mfsmaster1 mfsmaster[18477]: csdb: found cs using ip:port and csid (192.168.1.83:9422,6)
>> Dec 9 08:01:38 mfsmaster1 mfsmaster[18477]: chunkserver register begin (packet version: 6) - ip: 192.168.1.83 / port: 9422, usedspace: 4585739321344 (4270.80 GiB), totalspace: 12073407995904 (11244.24 GiB)
>> Dec 9 08:01:38 mfsmaster1 mfsmaster[18477]: csdb: found cs using ip:port and csid (192.168.1.78:9422,1)
>> Dec 9 08:01:38 mfsmaster1 mfsmaster[18477]: chunkserver register begin (packet version: 6) - ip: 192.168.1.78 / port: 9422, usedspace: 864117645312 (804.77 GiB), totalspace: 2249241448448 (2094.77 GiB)
>> Dec 9 08:01:40 mfsmaster1 mfsmaster[18477]: child finished
>> Dec 9 08:01:40 mfsmaster1 mfsmaster[18477]: csdb: found cs using ip:port and csid (192.168.1.82:9422,5)
>> Dec 9 08:01:40 mfsmaster1 mfsmaster[18477]: chunkserver register begin (packet version: 6) - ip: 192.168.1.82 / port: 9422, usedspace: 4585512808448 (4270.59 GiB), totalspace: 12073950502912 (11244.74 GiB)
>> Dec 9 08:01:40 mfsmaster1 mfsmaster[18477]: store process has finished - store time: 99.380
>> Dec 9 08:02:18 mfsmaster1 mfsmaster[18477]: server ip: 192.168.1.78 / port: 9422 has been fully removed from data structures
>> Dec 9 08:02:18 mfsmaster1 mfsmaster[18477]: server ip: 192.168.1.82 / port: 9422 has been fully removed from data structures
>> Dec 9 08:02:18 mfsmaster1 mfsmaster[18477]: server ip: 192.168.1.84 / port: 9422 has been fully removed from data structures
>> Dec 9 08:02:18 mfsmaster1 mfsmaster[18477]: server ip: 192.168.1.81 / port: 9422 has been fully removed from data structures
>> Dec 9 08:02:18 mfsmaster1 mfsmaster[18477]: server ip: 192.168.1.79 / port: 9422 has been fully removed from data structures
>> Dec 9 08:02:18 mfsmaster1 mfsmaster[18477]: server ip: 192.168.1.83 / port: 9422 has been fully removed from data structures
>> Dec 9 08:02:18 mfsmaster1 mfsmaster[18477]: server ip: 192.168.1.80 / port: 9422 has been fully removed from data structures
>> Dec 9 08:02:20 mfsmaster1 mfsmaster[18477]: chunkserver register end (packet version: 6) - ip: 192.168.1.79 / port: 9422
>> Dec 9 08:02:32 mfsmaster1 mfsmaster[18477]: chunkserver register end (packet version: 6) - ip: 192.168.1.78 / port: 9422
>> Dec 9 08:02:32 mfsmaster1 mfsmaster[18477]: chunkserver register end (packet version: 6) - ip: 192.168.1.80 / port: 9422
>> Dec 9 08:02:32 mfsmaster1 mfsmaster[18477]: chunkserver register end (packet version: 6) - ip: 192.168.1.81 / port: 9422
>> Dec 9 08:03:17 mfsmaster1 mfsmaster[18477]: chunkserver register end (packet version: 6) - ip: 192.168.1.83 / port: 9422
>> Dec 9 08:03:19 mfsmaster1 mfsmaster[18477]: chunkserver register end (packet version: 6) - ip: 192.168.1.84 / port: 9422
>> Dec 9 08:03:20 mfsmaster1 mfsmaster[18477]: chunkserver register end (packet version: 6) - ip: 192.168.1.82 / port: 9422
>> Dec 9 08:26:01 mfsmaster1 mfsmaster[18477]: structure check loop
>>
>> 2015-12-09
>> ------------------------------------------------------------------------
>> *方垚| (8610) 62368638-8906*
>> ------------------------------------------------------------------------
>> *From:* Aleksander Wieliczko
>> *Sent:* 2015-12-08 18:52:46
>> *To:* fangyao; moosefs-users
>> *Cc:*
>> *Subject:* Re: [MooseFS-Users] del and add
>> Hi.
>> Would you be so kind as to send us some more details from syslog and
>> tell us which MooseFS version you have?
>> This is too little information to draw any conclusions.
>>
>> Best regards
>> Aleksander Wieliczko
>> Technical Support Engineer
>> MooseFS.com <moosefs.com>
>> On 12/08/2015 11:06 AM, fangyao wrote:
>>> [root@mfsmaster1 data]# grep 63946327 changelog.*mfs
>>> changelog.11.mfs:18378184425: 1449525706|CHUNKDEL(63946327,1)
>>> changelog.11.mfs:18378260221: 1449525792|CHUNKADD(63946327,1,1450130592)
>>> changelog.21.mfs:18377349098: 1449489681|CHUNKDEL(63946327,1)
>>> changelog.21.mfs:18377425708: 1449489774|CHUNKADD(63946327,1,1450094574)
>>> changelog.28.mfs:18376418578: 1449464464|CHUNKDEL(63946327,1)
>>> changelog.28.mfs:18376495479: 1449464550|CHUNKADD(63946327,1,1450069350)
>>> changelog.35.mfs:18375745795: 1449439281|CHUNKDEL(63946327,1)
>>> changelog.35.mfs:18375823198: 1449439371|CHUNKADD(63946327,1,1450044171)
>>> changelog.37.mfs:18375586289: 1449432091|CHUNKDEL(63946327,1)
>>> changelog.37.mfs:18375662761: 1449432181|CHUNKADD(63946327,1,1450036981)
>>> changelog.3.mfs:18378924313: 1449554475|CHUNKDEL(63946327,1)
>>> changelog.3.mfs:18379012240: 1449554572|CHUNKADD(63946327,1,1450159372)
>>> changelog.45.mfs:18374453427: 1449403276|CHUNKDEL(63946327,1)
>>> changelog.45.mfs:18374529796: 1449403358|CHUNKADD(63946327,1,1450008158)
>>> changelog.6.mfs:18378583919: 1449543696|CHUNKDEL(63946327,1)
>>> changelog.6.mfs:18378669372: 1449543789|CHUNKADD(63946327,1,1450148589)
>>>
>>> These entries are written when EPIPE (Broken pipe) occurs, and in the monitor the locked unused files keep increasing and are never deleted.
>>> EPIPE almost always occurs during the dump, and iowait is not high (per second 15).
>>> We want to solve the EPIPE problem.
>>> Thanks.
>>>
>>> _________________________________________
>>> moosefs-users mailing list
>>> moo...@li...
>>> https://lists.sourceforge.net/lists/listinfo/moosefs-users
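
For anyone following this thread, the swap check suggested above (look at memory and swap at the full hour, while the master dumps metadata) can be sketched in Python roughly as follows. This is only an illustration: it assumes a Linux master that exposes /proc/meminfo, the function names are made up for this example and are not part of MooseFS, and the 20% figure is simply the upper end of the "10-20% more" estimate quoted in the thread.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_used_kib(fields):
    """Swap currently in use, computed as SwapTotal - SwapFree."""
    return fields.get("SwapTotal", 0) - fields.get("SwapFree", 0)


def dump_headroom_ok(mem_total_kib, master_rss_kib, extra_fraction=0.20):
    """Return True if RAM covers the master's resident size plus the
    extra memory assumed to be needed during the hourly metadata dump.

    extra_fraction is an assumption taken from the 10-20% estimate.
    """
    return master_rss_kib * (1.0 + extra_fraction) <= mem_total_kib


def report(path="/proc/meminfo"):
    """Print swap usage read from the given meminfo file (Linux only)."""
    with open(path) as fh:
        info = parse_meminfo(fh.read())
    print("swap used (kB):", swap_used_kib(info))
```

Calling `report()` on the master shortly after the full hour shows whether any swap is in use during the dump. With the numbers from this thread (48G RAM, 19G used by mfsmaster), `dump_headroom_ok(48 * 1024 * 1024, 19 * 1024 * 1024)` suggests there is enough headroom, which matches the observation that the system does not swap.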