From: f. <fa...@eb...> - 2015-12-09 08:08:12
|
Hi.

Server: Dell R420
CPU: Intel(R) Xeon(R) CPU E5-2420 0 @ 1.90GHz
RAM: 48G
mfsmaster uses 19G; about 68714195 files with goal 2, 25 TiB in total.

This is a snapshot of top while the master dumps metadata:

  PID USER  PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+ COMMAND
 9058 mfs    6 -19 19.4g  19g 3096 R 100.0 41.2  0:06.51 mfsmaster
18477 mfs    0 -19 19.4g  19g 4016 S   2.6 41.2  3925:34 mfsmaster

The system does not use swap because there is enough RAM.

By the way, can the chunkserver disconnect timeout be configured?

2015-12-09
方垚 | (8610) 62368638-8906

From: Aleksander Wieliczko
Sent: 2015-12-09 15:31:43
To: fangyao; moosefs-users
Cc:
Subject: Re: [MooseFS-Users] del and add

Hi.
Thank you for this information.

The first problem we noticed is that your chunkservers were disconnected at Dec 9 08:01:36. This kind of behaviour may indicate that your mfsmaster process is working really hard.
The second piece of information from your syslog is the metadata store time. It's about 100 seconds.

Combining these two pieces of information, we can assume that you may be experiencing a SWAPPING problem.
Every hour the master dumps metadata in a separate subprocess, so it needs more RAM during this operation (about 10-20% more than usual).

So can you check this on your mfsmaster server:
- Amount of RAM installed in your hardware.
- Amount of RAM used by master.

Also, can you run top on the full hour and check whether your system uses swap?
It is possible that you need to increase the amount of RAM in your master server.

Best regards
Aleksander Wieliczko
Technical Support Engineer
MooseFS.com

On 12/09/2015 04:39 AM, fangyao wrote:
hi.
All servers run MooseFS version 2.0.80-1, and we edited editchunks.c to comment out this line:

// syslog(LOG_WARNING,"chunkserver has nonexistent chunk (%016"PRIX64"_%08"PRIX32"), so create it for future deletion",chunkid,version);

syslog:

Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: main master server module: (ip:192.168.1.46) write error: EPIPE (Broken pipe)
Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: csdb: found cs using ip:port and csid (192.168.1.82:9422,5), but server is still connected
Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: can't accept chunkserver (ip: 192.168.1.82 / port: 9422)
Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver disconnected - ip: 192.168.1.82 / port: 9422, usedspace: 4585512808448 (4270.59 GiB), totalspace: 12073950502912 (11244.74 GiB)
Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver disconnected - ip: 192.168.1.80 / port: 9422, usedspace: 852051292160 (793.53 GiB), totalspace: 2219235217408 (2066.82 GiB)
Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver disconnected - ip: 192.168.1.83 / port: 9422, usedspace: 4585739321344 (4270.80 GiB), totalspace: 12073407995904 (11244.24 GiB)
Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver disconnected - ip: 192.168.1.79 / port: 9422, usedspace: 863847518208 (804.52 GiB), totalspace: 2249241448448 (2094.77 GiB)
Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver disconnected - ip: 192.168.1.81 / port: 9422, usedspace: 860428738560 (801.34 GiB), totalspace: 2240201351168 (2086.35 GiB)
Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver disconnected - ip: 192.168.1.84 / port: 9422, usedspace: 4586834935808 (4271.82 GiB), totalspace: 12073407995904 (11244.24 GiB)
Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver disconnected - ip: 192.168.1.82 / port: 9422, usedspace: 4585512808448 (4270.59 GiB), totalspace: 12073950502912 (11244.74 GiB)
Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver disconnected - ip: 192.168.1.78 / port: 9422, usedspace: 864117645312 (804.77 GiB), totalspace: 2249241448448 (2094.77 GiB)
Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: csdb: found cs using ip:port and csid (192.168.1.84:9422,7)
Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver register begin (packet version: 6) - ip: 192.168.1.84 / port: 9422, usedspace: 4586834935808 (4271.82 GiB), totalspace: 12073407995904 (11244.24 GiB)
Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: csdb: found cs using ip:port and csid (192.168.1.81:9422,4)
Dec 9 08:01:36 mfsmaster1 mfsmaster[18477]: chunkserver register begin (packet version: 6) - ip: 192.168.1.81 / port: 9422, usedspace: 860428738560 (801.34 GiB), totalspace: 2240201351168 (2086.35 GiB)
Dec 9 08:01:37 mfsmaster1 mfsmaster[18477]: csdb: found cs using ip:port and csid (192.168.1.79:9422,2)
Dec 9 08:01:37 mfsmaster1 mfsmaster[18477]: chunkserver register begin (packet version: 6) - ip: 192.168.1.79 / port: 9422, usedspace: 863847518208 (804.52 GiB), totalspace: 2249241448448 (2094.77 GiB)
Dec 9 08:01:37 mfsmaster1 mfsmaster[18477]: csdb: found cs using ip:port and csid (192.168.1.80:9422,3)
Dec 9 08:01:37 mfsmaster1 mfsmaster[18477]: chunkserver register begin (packet version: 6) - ip: 192.168.1.80 / port: 9422, usedspace: 852052361216 (793.54 GiB), totalspace: 2219235217408 (2066.82 GiB)
Dec 9 08:01:38 mfsmaster1 mfsmaster[18477]: csdb: found cs using ip:port and csid (192.168.1.83:9422,6)
Dec 9 08:01:38 mfsmaster1 mfsmaster[18477]: chunkserver register begin (packet version: 6) - ip: 192.168.1.83 / port: 9422, usedspace: 4585739321344 (4270.80 GiB), totalspace: 12073407995904 (11244.24 GiB)
Dec 9 08:01:38 mfsmaster1 mfsmaster[18477]: csdb: found cs using ip:port and csid (192.168.1.78:9422,1)
Dec 9 08:01:38 mfsmaster1 mfsmaster[18477]: chunkserver register begin (packet version: 6) - ip: 192.168.1.78 / port: 9422, usedspace: 864117645312 (804.77 GiB), totalspace: 2249241448448 (2094.77 GiB)
Dec 9 08:01:40 mfsmaster1 mfsmaster[18477]: child finished
Dec 9 08:01:40 mfsmaster1 mfsmaster[18477]: csdb: found cs using ip:port and csid (192.168.1.82:9422,5)
Dec 9 08:01:40 mfsmaster1 mfsmaster[18477]: chunkserver register begin (packet version: 6) - ip: 192.168.1.82 / port: 9422, usedspace: 4585512808448 (4270.59 GiB), totalspace: 12073950502912 (11244.74 GiB)
Dec 9 08:01:40 mfsmaster1 mfsmaster[18477]: store process has finished - store time: 99.380
Dec 9 08:02:18 mfsmaster1 mfsmaster[18477]: server ip: 192.168.1.78 / port: 9422 has been fully removed from data structures
Dec 9 08:02:18 mfsmaster1 mfsmaster[18477]: server ip: 192.168.1.82 / port: 9422 has been fully removed from data structures
Dec 9 08:02:18 mfsmaster1 mfsmaster[18477]: server ip: 192.168.1.84 / port: 9422 has been fully removed from data structures
Dec 9 08:02:18 mfsmaster1 mfsmaster[18477]: server ip: 192.168.1.81 / port: 9422 has been fully removed from data structures
Dec 9 08:02:18 mfsmaster1 mfsmaster[18477]: server ip: 192.168.1.79 / port: 9422 has been fully removed from data structures
Dec 9 08:02:18 mfsmaster1 mfsmaster[18477]: server ip: 192.168.1.83 / port: 9422 has been fully removed from data structures
Dec 9 08:02:18 mfsmaster1 mfsmaster[18477]: server ip: 192.168.1.80 / port: 9422 has been fully removed from data structures
Dec 9 08:02:20 mfsmaster1 mfsmaster[18477]: chunkserver register end (packet version: 6) - ip: 192.168.1.79 / port: 9422
Dec 9 08:02:32 mfsmaster1 mfsmaster[18477]: chunkserver register end (packet version: 6) - ip: 192.168.1.78 / port: 9422
Dec 9 08:02:32 mfsmaster1 mfsmaster[18477]: chunkserver register end (packet version: 6) - ip: 192.168.1.80 / port: 9422
Dec 9 08:02:32 mfsmaster1 mfsmaster[18477]: chunkserver register end (packet version: 6) - ip: 192.168.1.81 / port: 9422
Dec 9 08:03:17 mfsmaster1 mfsmaster[18477]: chunkserver register end (packet version: 6) - ip: 192.168.1.83 / port: 9422
Dec 9 08:03:19 mfsmaster1 mfsmaster[18477]: chunkserver register end (packet version: 6) - ip: 192.168.1.84 / port: 9422
Dec 9 08:03:20 mfsmaster1 mfsmaster[18477]: chunkserver register end (packet version: 6) - ip: 192.168.1.82 / port: 9422
Dec 9 08:26:01 mfsmaster1 mfsmaster[18477]: structure check loop

2015-12-09
方垚 | (8610) 62368638-8906

From: Aleksander Wieliczko
Sent: 2015-12-08 18:52:46
To: fangyao; moosefs-users
Cc:
Subject: Re: [MooseFS-Users] del and add

Hi.
Would you be so kind as to send us some more details from syslog and tell us which MooseFS version you have?
This is too little information to draw any conclusions.

Best regards
Aleksander Wieliczko
Technical Support Engineer
MooseFS.com

On 12/08/2015 11:06 AM, fangyao wrote:

[root@mfsmaster1 data]# grep 63946327 changelog.*mfs
changelog.11.mfs:18378184425: 1449525706|CHUNKDEL(63946327,1)
changelog.11.mfs:18378260221: 1449525792|CHUNKADD(63946327,1,1450130592)
changelog.21.mfs:18377349098: 1449489681|CHUNKDEL(63946327,1)
changelog.21.mfs:18377425708: 1449489774|CHUNKADD(63946327,1,1450094574)
changelog.28.mfs:18376418578: 1449464464|CHUNKDEL(63946327,1)
changelog.28.mfs:18376495479: 1449464550|CHUNKADD(63946327,1,1450069350)
changelog.35.mfs:18375745795: 1449439281|CHUNKDEL(63946327,1)
changelog.35.mfs:18375823198: 1449439371|CHUNKADD(63946327,1,1450044171)
changelog.37.mfs:18375586289: 1449432091|CHUNKDEL(63946327,1)
changelog.37.mfs:18375662761: 1449432181|CHUNKADD(63946327,1,1450036981)
changelog.3.mfs:18378924313: 1449554475|CHUNKDEL(63946327,1)
changelog.3.mfs:18379012240: 1449554572|CHUNKADD(63946327,1,1450159372)
changelog.45.mfs:18374453427: 1449403276|CHUNKDEL(63946327,1)
changelog.45.mfs:18374529796: 1449403358|CHUNKADD(63946327,1,1450008158)
changelog.6.mfs:18378583919: 1449543696|CHUNKDEL(63946327,1)
changelog.6.mfs:18378669372: 1449543789|CHUNKADD(63946327,1,1450148589)

These entries are written to the changelog when EPIPE (Broken pipe) occurs, and in the monitor the number of locked unused files keeps ascending; they are never deleted.
EPIPE almost always occurs during the metadata dump, and iowait is not high (per second 15).
We want to solve the EPIPE problem.
Thanks.

_________________________________________
moosefs-users mailing list
moo...@li...
https://lists.sourceforge.net/lists/listinfo/moosefs-users |
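
The diagnosis in this thread rests on two facts from syslog: the metadata store time and the timestamps of the chunkserver disconnects. Below is a minimal sketch, assuming syslog lines in the format quoted in this thread, of how one might check whether disconnects fall inside a dump window. The function names and the default year are illustrative assumptions and are not part of MooseFS; the dump start is only estimated as the "store process has finished" timestamp minus the reported store time.

```python
import re
from datetime import datetime, timedelta

# Patterns copied from the log lines quoted in this thread.
STORE_RE = re.compile(r"store process has finished - store time: (\d+(?:\.\d+)?)")
DISC_RE = re.compile(r"chunkserver disconnected - ip: (\S+) / port: (\d+)")


def parse_ts(line, year=2015):
    """Parse the leading 'Dec 9 08:01:36' syslog timestamp.

    Syslog omits the year, so it is supplied by the caller (assumption).
    %b depends on the locale; English month names are assumed.
    """
    month, day, clock = line.split(None, 3)[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def disconnects_during_dumps(lines, year=2015):
    """Return one record per metadata dump, listing the chunkserver IPs
    that disconnected between the estimated start and the end of the dump."""
    dumps = []        # (estimated start, end, store time in seconds)
    disconnects = []  # (timestamp, ip)
    for line in lines:
        m = DISC_RE.search(line)
        if m:
            disconnects.append((parse_ts(line, year), m.group(1)))
            continue
        m = STORE_RE.search(line)
        if m:
            end = parse_ts(line, year)
            secs = float(m.group(1))
            dumps.append((end - timedelta(seconds=secs), end, secs))

    report = []
    for start, end, secs in dumps:
        hit = sorted({ip for ts, ip in disconnects if start <= ts <= end})
        report.append(
            {"store_time": secs, "start": start, "end": end, "disconnected": hit}
        )
    return report
```

For the log in this thread, the 99.380 second store that finished at 08:01:40 implies a dump start just after 08:00:00, so the disconnects logged at 08:01:36 fall inside the dump window. That matches the observation that EPIPE almost always occurs during the dump, but it does not by itself show whether swapping or something else is the cause.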