From: Robert <rsa...@ne...> - 2011-06-29 20:33:14
|
Yes, it is rather disappointing how Linux slows down on large folders. Not only that, but some file systems like ext2/3/4 limit the number of files in a folder to a low number (16k/16k/64k). Generally we store files in a hash-based structure where the first n characters of the file's hash form the first folder name, the next m characters of the file's hash form the second folder name, and the filename is the hash of the file. This prevents us from having too large a number of files in a folder, avoids the per-folder file limits and slowdowns, and prevents a too-deep directory structure. It also gives us some deduplication.

Yes, we use CentOS, but installing and using the ktune package generally resolves most of the performance issues and differences I have seen with Ubuntu/Debian. It also helps to disable the cpuspeed daemon if you know the server does not have much down time.

I don't understand the comment on hitting the metadata a lot. What is "a lot"? Why would it make a difference? All the metadata is in RAM anyway. The biggest limit to speed seems to be the number of IOPS that you can get out of the disks available to you; looking up the metadata from RAM should be several orders of magnitude faster than that.

The activity reported through the CGI interface on the master is around 2,400 opens per minute on average. Reads and writes are also around 2,400 per minute, alternating with each other. mknod has some peaks around 2,800 per minute but is generally much lower. Lookups are around 8,000 per minute and getattr is around 700 per minute. Chunk replication and deletion is around 50 per minute. The other numbers are generally very low.

Is there a guide or are there hints specific to MooseFS on which IO/Net/Process parameters would be good to investigate for mfsmaster?

Robert

-----Original Message----- From: Robert Dye <ro...@in...> To: 'Robert' <rsa...@ne...>; moosefs-users <moo...@li...> Sent: Wed, Jun 29, 2011 3:00 pm Subject: RE: [Moosefs-users] Write starvation

As Ricardo mentioned below, you could be hitting the metadata, a lot. When I ran benchmarks for enumerations in a directory with more than 10,000 files, it was an obvious slowdown in comparison to a directory with only a few files. Not sure which OS you are running, but I would tweak the many IO/Net/Process parameters. From personal experience, I had a huge slowdown with the mfsmaster server when running on CentOS. I have since moved to Ubuntu, which appears to be faster out of the box. -Rob

From: Robert [mailto:rsa...@ne...] Sent: Wednesday, June 29, 2011 11:29 AM To: ric...@da...; moo...@li... Subject: Re: [Moosefs-users] Write starvation

Yes, the master server is working hard. But I would still expect a somewhat fair distribution of load between read and write. The specs: 2 x quad core Xeon E5405 @ 2 GHz, 64 GB of RAM, 32 x 2 TB 7200 RPM SATA disks, 68 million file system objects, 65.4 million files. No swap is being used. mfsmaster is using 23 GB of RAM. Robert

-----Original Message----- From: Ricardo J. Barberis <ric...@da...> To: moosefs-users <moo...@li...> Sent: Wed, Jun 29, 2011 12:08 pm Subject: Re: [Moosefs-users] Write starvation

On Tuesday, 28 June 2011, Robert Sandilands wrote:
> Write traffic does not stop completely, but seems to slow down to < 10
> kB per second under high read traffic conditions. When the read traffic
> decreases the write traffic will increase to normal levels.
>
> Is this a known problem? Is there something I can do to ensure that
> write traffic is not starved by read traffic?
>
> Robert

What are the master machine specs, and how many files do you have already on MFS? You might be hitting the master too hard with metadata. Regards, -- Ricardo J. Barberis Senior SysAdmin / ITI Dattatec.com :: Soluciones de Web Hosting Tu Hosting hecho Simple! |
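A minimal shell sketch of the hash-based layout Robert describes above. The two-character split widths, the SHA-1 choice, and the /srv/store path are illustrative assumptions, not his exact scheme:

    #!/bin/bash
    # Store a file under /srv/store/<h1>/<h2>/<hash>, where h1 and h2 are the
    # first and second pairs of characters of the file's content hash.
    f="$1"
    hash=$(sha1sum "$f" | awk '{print $1}')
    dir="/srv/store/${hash:0:2}/${hash:2:2}"
    mkdir -p "$dir"
    cp "$f" "$dir/$hash"   # identical content maps to the same name, giving some deduplication

With two levels of two hex characters each, no directory ever holds more than 256 subdirectories, which avoids the per-folder limits and lookup slowdowns mentioned in the post while keeping the tree shallow.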
From: Zachary A W. <zw...@ii...> - 2011-06-29 19:11:14
|
So after successfully building a test network last week for my professor (consisting of 1 master, 1 meta, 2 chunkservers, and 1 client), I arrived this week to find the machines turned off. After restarting, I started the machines in the order master, meta, chunk1, chunk2, client. However, now the client will not mount to the master no matter what I do. Has anybody ever had this problem before? Is it something I need to change on my MooseFS machines or do you think it may have to do with my network? FYI I am very new to Linux and extremely new to MooseFS. Thank you, Zach |
From: Robert D. <ro...@in...> - 2011-06-29 19:00:33
|
As Ricardo mentioned below, you could be hitting the metadata, a lot. When I ran benchmarks for enumerations in a directory with more than 10,000 files, it was an obvious slowdown in comparison to a directory with only a few files. Not sure which OS you are running, but I would tweak the many IO/Net/Process parameters. From personal experience, I had a huge slowdown with the mfsmaster server when running on CentOS. I have since moved to Ubuntu, which appears to be faster out of the box. -Rob

_____

From: Robert [mailto:rsa...@ne...] Sent: Wednesday, June 29, 2011 11:29 AM To: ric...@da...; moo...@li... Subject: Re: [Moosefs-users] Write starvation

Yes, the master server is working hard. But I would still expect a somewhat fair distribution of load between read and write. The specs: 2 x quad core Xeon E5405 @ 2 GHz, 64 GB of RAM, 32 x 2 TB 7200 RPM SATA disks, 68 million file system objects, 65.4 million files. No swap is being used. mfsmaster is using 23 GB of RAM. Robert

-----Original Message----- From: Ricardo J. Barberis <ric...@da...> To: moosefs-users <moo...@li...> Sent: Wed, Jun 29, 2011 12:08 pm Subject: Re: [Moosefs-users] Write starvation

On Tuesday, 28 June 2011, Robert Sandilands wrote:
> Write traffic does not stop completely, but seems to slow down to < 10
> kB per second under high read traffic conditions. When the read traffic
> decreases the write traffic will increase to normal levels.
>
> Is this a known problem? Is there something I can do to ensure that
> write traffic is not starved by read traffic?
>
> Robert

What are the master machine specs, and how many files do you have already on MFS? You might be hitting the master too hard with metadata. Regards, -- Ricardo J. Barberis Senior SysAdmin / ITI Dattatec.com :: Soluciones de Web Hosting Tu Hosting hecho Simple! |
From: Robert <rsa...@ne...> - 2011-06-29 18:45:25
|
Yes, the master server is working hard. But I would still expect a somewhat fair distribution of load between read and write. The specs: 2 x quad core Xeon E5405 @ 2 GHz, 64 GB of RAM, 32 x 2 TB 7200 RPM SATA disks, 68 million file system objects, 65.4 million files. No swap is being used. mfsmaster is using 23 GB of RAM. Robert

-----Original Message----- From: Ricardo J. Barberis <ric...@da...> To: moosefs-users <moo...@li...> Sent: Wed, Jun 29, 2011 12:08 pm Subject: Re: [Moosefs-users] Write starvation

On Tuesday, 28 June 2011, Robert Sandilands wrote:
> Write traffic does not stop completely, but seems to slow down to < 10
> kB per second under high read traffic conditions. When the read traffic
> decreases the write traffic will increase to normal levels.
>
> Is this a known problem? Is there something I can do to ensure that
> write traffic is not starved by read traffic?
>
> Robert

What are the master machine specs, and how many files do you have already on MFS? You might be hitting the master too hard with metadata. Regards, -- Ricardo J. Barberis Senior SysAdmin / ITI Dattatec.com :: Soluciones de Web Hosting Tu Hosting hecho Simple! |
From: Sébastien M. <seb...@gm...> - 2011-06-29 18:35:22
|
Hi, I'm currently interested in deploying a distributed filesystem and have read a paper about MooseFS. I have a few questions before starting the job: 1/ Why MooseFS instead of GlusterFS or XtreemFS? 2/ A paper I read from a French university described GlusterFS as much faster than MooseFS; did you benchmark them too? 3/ Why FUSE and not kernel mode (which should be faster)? Thanks in advance, Sébastien |
From: Ricardo J. B. <ric...@da...> - 2011-06-29 16:08:48
|
On Tuesday, 28 June 2011, Robert Sandilands wrote:
> Write traffic does not stop completely, but seems to slow down to < 10
> kB per second under high read traffic conditions. When the read traffic
> decreases the write traffic will increase to normal levels.
>
> Is this a known problem? Is there something I can do to ensure that
> write traffic is not starved by read traffic?
>
> Robert

What are the master machine specs, and how many files do you have already on MFS? You might be hitting the master too hard with metadata. Regards, -- Ricardo J. Barberis Senior SysAdmin / ITI Dattatec.com :: Soluciones de Web Hosting Tu Hosting hecho Simple! |
From: Ricardo J. B. <ric...@da...> - 2011-06-29 16:03:58
|
On Wednesday, 29 June 2011, youngcow wrote:
> I think you can use RAID technology to protect against drive failures on one
> chunkserver.

Yes, but as Laurent said, you (still) lose fault tolerance. Disks aren't the only components that can fail; you also have to take into consideration the power supply, memory, processor, motherboard, etc. So I guess it depends on how critical the data you put in MFS is, or as I read some time ago:

"<Dr_Memory> or to put it another way: your kitten photos do not need the same high-availability system infrastructure as Citibank's transaction databases :) <topaz> I CAN HAS FIEV NIENS?" :)

> > I have a sense that this question is due to confusing drives and
> > chunkservers. MFS only replicates across chunkservers. So, if you have
> > multiple drives attached to a single chunkserver, MFS won't replicate
> > inside of the chunkserver.
> >
> > This brings up my question - can one safely run multiple chunkserver
> > processes on a single machine?
> >
> > max

Cheers, -- Ricardo J. Barberis Senior SysAdmin / ITI Dattatec.com :: Soluciones de Web Hosting Tu Hosting hecho Simple! |
From: Laurent W. <lw...@hy...> - 2011-06-29 14:13:54
|
On Wed, 29 Jun 2011 01:33:35 +0000 (GMT) Zachary A Wagner <zw...@ii...> wrote:
> Hello, I am fairly new to Linux and extremely new to MooseFS. I am trying to set up a testing network of 5 systems for my professor: 1 Master, 1 Metalogger, 2 Chunkservers (each with 2 mounted partitions), and 1 client.
> Everything seemed to go well (no errors, all the mfs machines started without error) until I neared the end of Client setup in your step-by-step guide. After I entered the "df -h | grep mfs" command, I received data for the
> master but not for the chunkservers' partitions (which are named mfschunks11, mfschunks12, mfschunks21, and mfschunks22). I was wondering if this was normal? If not, do you know anything that may be wrong or that
> might help me?

It is perfectly normal: on the client you only get information for the MFS volume, and the chunkservers' own disk space information is not exposed to users. If you need information about disk usage on a chunkserver, run df -h on that chunkserver itself. HTH, -- Laurent Wandrebeck HYGEOS, Earth Observation Department / Observation de la Terre Euratechnologies 165 Avenue de Bretagne 59000 Lille, France tel: +33 3 20 08 24 98 http://www.hygeos.com GPG fingerprint/Empreinte GPG: F5CA 37A4 6D03 A90C 7A1D 2A62 54E6 EF2C D17C F64C |
From: Laurent W. <lw...@hy...> - 2011-06-29 13:44:05
|
On Wed, 29 Jun 2011 16:24:37 +0300 Stas Oskin <sta...@gm...> wrote:
> Hi.
>
> What is the best way to retire a chunkserver?
>
> Is there a command that can do this? Or can I just stop mfschunkserver and
> let MFS resync the chunks to other chunkservers in the cluster?

The best thing is to add a * in front of each mount point used by MFS on the chunkserver. Restart the CS and let the replication do its job, then retire the CS. That way, you'll never be in a situation where the goal is lower than it should be. HTH, -- Laurent Wandrebeck HYGEOS, Earth Observation Department / Observation de la Terre Euratechnologies 165 Avenue de Bretagne 59000 Lille, France tel: +33 3 20 08 24 98 http://www.hygeos.com GPG fingerprint/Empreinte GPG: F5CA 37A4 6D03 A90C 7A1D 2A62 54E6 EF2C D17C F64C |
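For concreteness, a sketch of the marking Laurent describes. The mount points a chunkserver serves are listed in its mfshdd.cfg; the file location and paths below are examples, so check them against your own setup:

    # /etc/mfshdd.cfg on the chunkserver being retired
    # A leading '*' marks the disk for removal; the master then replicates
    # its chunks to other chunkservers before you shut this one down.
    */mnt/mfschunks1
    */mnt/mfschunks2

After editing, restart the chunkserver process and watch the CGI interface until no chunks remain undergoal, then stop the chunkserver for good.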
From: Alexander A. <akh...@ri...> - 2011-06-29 13:41:08
|
Hi Stas! Please see this: http://www.moosefs.org/moosefs-faq.html#add_remove wbr Alexander ====================================================== Hi. What is the best way to retire a chunkserver? Is there a command that can do this? Or can I just stop mfschunkserver and let MFS resync the chunks to other chunkservers in the cluster? Thanks. |
From: Stas O. <sta...@gm...> - 2011-06-29 13:24:57
|
Hi. What is the best way to retire a chunkserver? Is there a command that can do this? Or can I just stop mfschunkserver and let MFS resync the chunks to other chunkservers in the cluster? Thanks. |
From: jyc <mai...@gm...> - 2011-06-29 13:02:39
|
I reply to myself. I ran a memtest on the server. The BIOS had ECC enabled, but the memory was not ECC-capable. memtest reported an error once, which explains the bad CRC. My conclusion: don't test MooseFS on hardware that is too old :-)

jyc wrote:
> Hi everyone,
>
> I have a cluster with 4 chunkservers running MooseFS, and on one server I have this problem:
> the filesystem check info had reported no problems for a long time, but now I get one chunk
> with a goal of one and zero valid copies.
>
> In the log I get:
>
> Jun 28 14:53:39 read_block_from_chunk:
> file:/opt/ba1d3/mfs/2B/chunk_000000000000962B_00000001.mfs - crc error
>
> but the file was written five days before:
>
> mfs mfs 65M jun 23 03:04
> /opt/ba1d3/mfs/2B/chunk_000000000000962B_00000001.mfs
>
> Can you explain why this chunk was correct for five days and now it seems that it is not?
>
> By the way, which process does the CRC check? Is it the mfschunkserver that checks it
> (I think it must be)? And where is the "correct" CRC stored? In the mfsmaster process/file?
>
> (On this server I have 4 HDDs, all with XFS filesystems. I changed the motherboard;
> the system is still the same after changing the motherboard. Kernel 2.6.26-2-686.)
>
> Any clue? |
From: youngcow <you...@gm...> - 2011-06-29 12:27:07
|
I think you can use RAID technology to protect against drive failures on one chunkserver.

> I have a sense that this question is due to confusing drives and chunkservers. MFS only replicates across chunkservers. So, if you have multiple drives attached to a single chunkserver, MFS won't replicate inside of the chunkserver.
>
> This brings up my question - can one safely run multiple chunkserver processes on a single machine?
>
> max
>
> On Jun 14, 2011, at 4:26 PM, Laurent Wandrebeck wrote:
>
>> On Mon, 13 Jun 2011 17:20:39 -0700
>> Howie Chen <ho...@he...> wrote:
>>
>>> Dear MFS users,
>> Hello,
>>>
>>> How come my moosefs cannot replicate data automatically? Where to turn on this option?
>> Either you're joking, or you forgot to set goal >1 (mfssetgoal), or you
>> didn't wait long enough once the goal was set? It can take a couple
>> minutes.
>> Oh, and if you have a single chunkserver, a goal >1 isn't taken care
>> of, because it has absolutely no use.
>> Hope it helps,
>> --
>> Laurent Wandrebeck
>> HYGEOS, Earth Observation Department / Observation de la Terre
>> Euratechnologies
>> 165 Avenue de Bretagne
>> 59000 Lille, France
>> tel: +33 3 20 08 24 98
>> http://www.hygeos.com
>> GPG fingerprint/Empreinte GPG: F5CA 37A4 6D03 A90C 7A1D 2A62 54E6 EF2C
>> D17C F64C |
From: Laurent W. <lw...@hy...> - 2011-06-29 08:31:01
|
On Wed, 29 Jun 2011 16:07:49 +0800 Max Cantor <mxc...@gm...> wrote:
> I have a sense that this question is due to confusing drives and chunkservers. MFS only replicates across chunkservers. So, if you have multiple drives attached to a single chunkserver, MFS won't replicate inside of the chunkserver.

Agreed.

> This brings up my question - can one safely run multiple chunkserver processes on a single machine?
>
> max

Provided you take care of the port numbers and the partitions used by the different CS processes, yes. But you lose fault tolerance: a failing box will take down several CSes, which is pretty bad. You definitely don't want to do that :) HTH, -- Laurent Wandrebeck HYGEOS, Earth Observation Department / Observation de la Terre Euratechnologies 165 Avenue de Bretagne 59000 Lille, France tel: +33 3 20 08 24 98 http://www.hygeos.com GPG fingerprint/Empreinte GPG: F5CA 37A4 6D03 A90C 7A1D 2A62 54E6 EF2C D17C F64C |
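If you did want a second chunkserver process on one box despite the fault-tolerance warning above, the instances have to be kept apart as Laurent says. A rough sketch of the second instance's config; the option names should be verified against your mfschunkserver.cfg man page, and every value here is an assumption:

    # /etc/mfschunkserver2.cfg -- second instance on the same host
    CSSERV_LISTEN_PORT = 9423              # the first instance keeps the default 9422
    DATA_PATH = /var/lib/mfs-cs2           # separate lock/state directory
    HDD_CONF_FILENAME = /etc/mfshdd2.cfg   # separate, non-overlapping disk list

The second instance would then be started with something like mfschunkserver -c /etc/mfschunkserver2.cfg so it does not pick up the default config.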
From: Max C. <mxc...@gm...> - 2011-06-29 08:08:01
|
I have a sense that this question is due to confusing drives and chunkservers. MFS only replicates across chunkservers. So, if you have multiple drives attached to a single chunkserver, MFS won't replicate inside of the chunkserver.

This brings up my question - can one safely run multiple chunkserver processes on a single machine?

max

On Jun 14, 2011, at 4:26 PM, Laurent Wandrebeck wrote:
> On Mon, 13 Jun 2011 17:20:39 -0700
> Howie Chen <ho...@he...> wrote:
>
>> Dear MFS users,
> Hello,
>>
>> How come my moosefs cannot replicate data automatically? Where to turn on this option?
> Either you're joking, or you forgot to set goal >1 (mfssetgoal), or you
> didn't wait long enough once the goal was set? It can take a couple
> minutes.
> Oh, and if you have a single chunkserver, a goal >1 isn't taken care
> of, because it has absolutely no use.
> Hope it helps,
> --
> Laurent Wandrebeck
> HYGEOS, Earth Observation Department / Observation de la Terre
> Euratechnologies
> 165 Avenue de Bretagne
> 59000 Lille, France
> tel: +33 3 20 08 24 98
> http://www.hygeos.com
> GPG fingerprint/Empreinte GPG: F5CA 37A4 6D03 A90C 7A1D 2A62 54E6 EF2C
> D17C F64C |
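A concrete illustration of the goal setting Laurent refers to, run from a client mount; the mount point, path, and goal value are examples only:

    # require 2 copies of everything under this directory (recursively)
    mfssetgoal -r 2 /mnt/mfs/data
    # check what is currently set
    mfsgetgoal /mnt/mfs/data

With a single chunkserver the extra copies cannot be placed anywhere, which is why a goal above 1 has no effect in that case.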
From: Robert S. <rsa...@ne...> - 2011-06-29 03:01:17
|
I have been seeing crashes of mfsmaster every few days. Generally it happens within a few minutes of the hour. For example 8:02 or 7:02. It has happened on June the 9th, 16th and 26th either around 7 AM or 8 AM. I am using mfs 1.6.20 on Centos 5.6 64-bit. Currently there are only 2 chunk servers and 2 loggers. The one chunk server is also the master. Both chunk servers also mount the volume locally and serve the content through Apache and DNS based load balancing. The volumes are also mounted by other machines which may re-export it using samba or use it locally. Any ideas of what I can do to troubleshoot/prevent this? Some of the logs before the crash: Jun 26 08:02:37 master mfsmaster[12670]: connection with client(ip:127.0.0.1) has been closed by peer Jun 26 08:02:39 master mfsmaster[12670]: connection with client(ip:xxx.xxx.x.14) has been closed by peer Jun 26 08:02:43 master mfsmaster[12670]: connection with client(ip:xxx.xxx.x.139) has been closed by peer Jun 26 08:02:56 master mfsmaster[12670]: connection with client(ip:xxx.xxx.x.15) has been closed by peer Jun 26 08:02:56 master mfsmaster[12670]: got chunk status, but don't want it Jun 26 08:02:56 master mfsmaster[12670]: connection with CS(xxx.xxx.x.55) has been closed by peer Jun 26 08:02:56 master mfsmaster[12670]: chunkserver disconnected - ip: xxx.xxx.x.55, port: 9422, usedspace: 21515155550208 (20037.55 GiB), totalspace: 25999924264960 (24214.32 GiB) Jun 26 08:03:02 master mfsmaster[12670]: connection with ML(xxx.xxx.x.139) has been closed by peer Jun 26 08:03:02 master mfsmaster[12670]: connection with ML(xxx.xxx.x.14) has been closed by peer Jun 26 08:03:02 master mfsmaster[12670]: connection with client(ip:xxx.xxx.x.139) has been closed by peer Jun 26 08:03:02 master mfsmaster[12670]: connection with ML(xxx.xxx.x.139) has been closed by peer Jun 26 08:03:02 master mfsmaster[12670]: connection with client(ip:xxx.xxx.x.15) has been closed by peer Jun 26 08:03:02 master mfsmaster[12670]: chunkserver register begin (packet version: 5) - ip: xxx.xxx.x.55, port: 9422 Jun 26 08:03:02 master mfsmaster[12670]: connection with ML(xxx.xxx.x.14) has been closed by peer Jun 26 08:03:02 master mfsmaster[12670]: connection with client(ip:xxx.xxx.x.14) has been closed by peer Jun 26 08:03:02 master mfsmaster[12670]: connection with client(ip:xxx.xxx.x.139) has been closed by peer Jun 26 08:03:37 master mfsmaster[12670]: connection with client(ip:xxx.xxx.x.14) has been closed by peer Jun 26 08:03:37 master mfsmaster[12670]: connection with CS(xxx.xxx.x.55) has been closed by peer Jun 26 08:03:37 master mfsmaster[12670]: chunkserver disconnected - ip: xxx.xxx.x.55, port: 9422, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB) Jun 26 08:03:39 master mfsmaster[12670]: connection with client(ip:xxx.xxx.x.15) has been closed by peer Jun 26 08:03:40 master mfsmaster[12670]: connection with client(ip:xxx.xxx.x.139) has been closed by peer Jun 26 08:03:40 master mfsmaster[12670]: connection with client(ip:xxx.xxx.x.14) has been closed by peer Jun 26 08:03:40 master mfsmaster[12670]: connection with client(ip:xxx.xxx.x.15) has been closed by peer Robert |
From: Robert S. <rsa...@ne...> - 2011-06-29 02:30:32
|
I have been moving data from existing non-distributed file systems onto a MooseFS file system. I am using 1.6.20 on CentOS 5.6. While moving the data I have also transferred some of the normal read-only traffic load to use the data already moved onto the MFS volume. What I can see is that whenever there is any significant read traffic, write traffic slows down to a crawl. When I look at the server charts generated by mfscgiserv for any of the chunk servers, read and write traffic seem to alternate. Write traffic does not stop completely, but seems to slow down to < 10 kB per second under high read traffic conditions. When the read traffic decreases, the write traffic will increase to normal levels. Is this a known problem? Is there something I can do to ensure that write traffic is not starved by read traffic? Robert |
From: Zachary A W. <zw...@ii...> - 2011-06-29 01:46:15
|
Hello, I am fairly new to Linux and extremely new to MooseFS. I am trying to set up a testing network of 5 systems for my professor: 1 Master, 1 Metalogger, 2 Chunkservers (each with 2 mounted partitions), and 1 client. Everything seemed to go well (no errors, all the mfs machines started without error) until I neared the end of Client setup in your step-by-step guide. After I entered the "df -h | grep mfs" command, I received data for the master but not for the chunkservers' partitions (which are named mfschunks11, mfschunks12, mfschunks21, and mfschunks22). I was wondering if this was normal? If not, do you know anything that may be wrong or that might help me? Thank you, Zach |
From: jyc <mai...@gm...> - 2011-06-28 14:57:57
|
Hi everyone, I have a cluster with 4 chunkservers running MooseFS, and on one server I have this problem: the filesystem check info had reported no problems for a long time, but now I get one chunk with a goal of one and zero valid copies. In the log I get:

Jun 28 14:53:39 read_block_from_chunk: file:/opt/ba1d3/mfs/2B/chunk_000000000000962B_00000001.mfs - crc error

but the file was written five days before:

mfs mfs 65M jun 23 03:04 /opt/ba1d3/mfs/2B/chunk_000000000000962B_00000001.mfs

Can you explain why this chunk was correct for five days and now it seems that it is not? By the way, which process does the CRC check? Is it the mfschunkserver that checks it (I think it must be)? And where is the "correct" CRC stored? In the mfsmaster process/file? (On this server I have 4 HDDs, all with XFS filesystems. I changed the motherboard; the system is still the same after changing the motherboard. Kernel 2.6.26-2-686.) Any clue? |
From: Ólafur Ó. <osv...@ne...> - 2011-06-28 11:14:57
|
Hi, I agree it seems to have no effect on anything other than mfsexports.cfg reloading. We are running Xen VM images on the MFS partition and I'm not sure how they will handle the mfsmaster restarting; I will have to set up a test environment and try it.

Reading the man page for mfsmaster.cfg, I see the comments for CHUNKS_LOOP_TIME and CHUNKS_DEL_LIMIT, and my understanding is that with the default values the maximum number of chunks to delete in one loop (300 seconds) is 100. It does not say whether that is per chunkserver or for the whole system, but each server here was at around and over 5,000 chunk deletions per minute, and with 10 servers that's over 50k chunk deletions per minute for the whole system.

CHUNKS_LOOP_TIME - Chunks loop frequency in seconds (default is 300)
CHUNKS_DEL_LIMIT - Maximum number of chunks to delete in one loop (default is 100)
CHUNKS_WRITE_REP_LIMIT - Maximum number of chunks to replicate to one chunkserver in one loop (default is 1)
CHUNKS_READ_REP_LIMIT - Maximum number of chunks to replicate from one chunkserver in one loop (default is 5)

Deleting 100 and only replicating 1, that's quite a difference. Still puzzled by this. /Oli

On 28.6.2011, at 10:54, rxknhe wrote:

I am not sure if SIGHUP is going to work. It works for a few things like re-reading the mfsexports.cfg file, but we also noticed that it won't work when changing parameters in mfsmaster.cfg, and one has to restart the master server process. Although we found that restarting the master server process is normally safe, I agree that the better way would be to handle it via SIGHUP, as it is a little scary to restart the master in a production system.

2011/6/28 Ólafur Ósvaldsson <osv...@ne...>

Hi, When it was at its peak I tried changing that value down to 10 and ultimately to 1 and sending the mfsmaster a SIGHUP, but nothing changed. For us, restarting the master is not an option, since all the systems were still getting service, although very slowly. /Oli

On 27.6.2011, at 21:52, rxknhe wrote:

On the master, maybe try tweaking this option in the mfsmaster.cfg file and restarting the master server process: # CHUNKS_DEL_LIMIT = 100 - uncomment it and try lower values.

2011/6/27 Ólafur Ósvaldsson <osv...@ne...>

Hi, Our system had just over 4.5 million chunks until one week ago, when a good deal of the data was deleted (on purpose). Today the trash time expired and it seems that our master is deleting all the unused chunks, which is quite normal, except that each of the 10 servers is doing up to 5.5k chunk deletions per minute, all systems accessing the MFS partitions are very slow, and that started at the same time as the chunk deletions. Is there any way to tune MFS so that chunk deletions don't have such an impact on the system? Can we somehow control how many chunks are deleted at once, set a maximum rate per minute, or maybe have it somehow linked to chunkserver/master load? The network links are not saturated and none of the servers seem to be loaded at all. The disk IO on the master even went down when this started, as did the CPU usage of the server, but the memory usage has gone up about 30%. Could it be that the master process is not handling all this at once? /Oli

-- Ólafur Osvaldsson System Administrator Nethonnun ehf. e-mail: osv...@ne... phone: +354 517 3400 |
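For reference, the throttling discussed above would be expressed in mfsmaster.cfg roughly as follows. The values are illustrative, and as rxknhe notes a SIGHUP may not pick these particular options up on 1.6.x, so a master restart may be required for them to take effect:

    # /etc/mfsmaster.cfg -- slow down trash purging
    CHUNKS_LOOP_TIME = 300    # chunk loop frequency in seconds (default 300)
    CHUNKS_DEL_LIMIT = 10     # maximum chunk deletions per loop (default 100)

Whether the deletion limit applies per chunkserver or system-wide is exactly the ambiguity Ólafur raises above, so any chosen value should be validated against the deletion rates shown in the CGI interface.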
From: Ólafur Ó. <osv...@ne...> - 2011-06-28 08:27:54
|
Hi, When it was at its peak I tried changing that value down to 10 and ultimately to 1 and sending the mfsmaster a SIGHUP, but nothing changed. For us, restarting the master is not an option, since all the systems were still getting service, although very slowly. /Oli

On 27.6.2011, at 21:52, rxknhe wrote:

On the master, maybe try tweaking this option in the mfsmaster.cfg file and restarting the master server process: # CHUNKS_DEL_LIMIT = 100 - uncomment it and try lower values.

2011/6/27 Ólafur Ósvaldsson <osv...@ne...>

Hi, Our system had just over 4.5 million chunks until one week ago, when a good deal of the data was deleted (on purpose). Today the trash time expired and it seems that our master is deleting all the unused chunks, which is quite normal, except that each of the 10 servers is doing up to 5.5k chunk deletions per minute, all systems accessing the MFS partitions are very slow, and that started at the same time as the chunk deletions. Is there any way to tune MFS so that chunk deletions don't have such an impact on the system? Can we somehow control how many chunks are deleted at once, set a maximum rate per minute, or maybe have it somehow linked to chunkserver/master load? The network links are not saturated and none of the servers seem to be loaded at all. The disk IO on the master even went down when this started, as did the CPU usage of the server, but the memory usage has gone up about 30%. Could it be that the master process is not handling all this at once? /Oli

-- Ólafur Osvaldsson System Administrator Nethonnun ehf. e-mail: osv...@ne... phone: +354 517 3400 |
From: Stas O. <sta...@gm...> - 2011-06-27 23:23:48
|
More on this - every time running mount -a, I see there are 2 processes: 14630 ? S<sl 0:00 mfsmount /mnt/mfs -o rw,mfsmaster=master1 14801 pts/0 S<l 0:00 mfsmount /mnt/mfs -o rw,mfsmaster=master1 Then after some time, the 2nd process stops - probably due to below error. Any idea why mfsmount tries to re-mount itself on every mount -a request? Thanks. On Tue, Jun 28, 2011 at 2:10 AM, Stas Oskin <sta...@gm...> wrote: > This appears when mounting manually: > > fuse: mountpoint is not empty > fuse: if you are sure this is safe, use the 'nonempty' mount option > error in fuse_mount > > But the mounting still succeeds, and I can browse files on cluster. > > Any idea if this related? > > > On Mon, Jun 27, 2011 at 10:48 PM, Stas Oskin <sta...@gm...> wrote: > >> This is the backtrace: >> >> #0 0x0000003cd8a30265 in raise () from /lib64/libc.so.6 >> #1 0x0000003cd8a31d10 in abort () from /lib64/libc.so.6 >> #2 0x0000003cd8a6a84b in __libc_message () from /lib64/libc.so.6 >> #3 0x0000003cd8a7230f in _int_free () from /lib64/libc.so.6 >> #4 0x0000003cd8a7276b in free () from /lib64/libc.so.6 >> #5 0x000000000040eaba in fuse_reply_entry () >> #6 0x00000000004121c9 in fuse_reply_entry () >> #7 0x0000000000412580 in fuse_reply_entry () >> #8 0x0000003cd8a1d994 in __libc_start_main () from /lib64/libc.so.6 >> #9 0x0000000000402519 in fuse_reply_entry () >> #10 0x00007fff48cb59d8 in ?? () >> #11 0x0000000000000000 in ?? () >> (gdb) >> >> On Mon, Jun 27, 2011 at 10:41 PM, Stas Oskin <sta...@gm...>wrote: >> >>> Hi. >>> >>> We found an issue where mfsmount, when mounted via the autofs script, is >>> segfaults. >>> >>> 1) Is this a known issue? >>> >>> 2) What are other alternatives of mounting mfsmount? >>> >>> Thanks! >>> >> >> > |
From: Stas O. <sta...@gm...> - 2011-06-27 23:10:26
|
This appears when mounting manually: fuse: mountpoint is not empty fuse: if you are sure this is safe, use the 'nonempty' mount option error in fuse_mount But the mounting still succeeds, and I can browse files on cluster. Any idea if this related? On Mon, Jun 27, 2011 at 10:48 PM, Stas Oskin <sta...@gm...> wrote: > This is the backtrace: > > #0 0x0000003cd8a30265 in raise () from /lib64/libc.so.6 > #1 0x0000003cd8a31d10 in abort () from /lib64/libc.so.6 > #2 0x0000003cd8a6a84b in __libc_message () from /lib64/libc.so.6 > #3 0x0000003cd8a7230f in _int_free () from /lib64/libc.so.6 > #4 0x0000003cd8a7276b in free () from /lib64/libc.so.6 > #5 0x000000000040eaba in fuse_reply_entry () > #6 0x00000000004121c9 in fuse_reply_entry () > #7 0x0000000000412580 in fuse_reply_entry () > #8 0x0000003cd8a1d994 in __libc_start_main () from /lib64/libc.so.6 > #9 0x0000000000402519 in fuse_reply_entry () > #10 0x00007fff48cb59d8 in ?? () > #11 0x0000000000000000 in ?? () > (gdb) > > On Mon, Jun 27, 2011 at 10:41 PM, Stas Oskin <sta...@gm...> wrote: > >> Hi. >> >> We found an issue where mfsmount, when mounted via the autofs script, is >> segfaults. >> >> 1) Is this a known issue? >> >> 2) What are other alternatives of mounting mfsmount? >> >> Thanks! >> > > |
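If the leftover files in the mount point are known to be stale, the FUSE warning above can be silenced by passing the nonempty option through mfsmount. A sketch, with the master name and mount point taken from the post; use it only if the existing directory contents really can be ignored:

    # mount over a non-empty directory; the old contents become hidden, not deleted
    mfsmount /mnt/mfs -o rw,nonempty,mfsmaster=master1

This only addresses the warning itself; the duplicate mfsmount processes spawned by mount -a and the segfault reported below are separate issues.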
From: Stas O. <sta...@gm...> - 2011-06-27 19:48:45
|
This is the backtrace: #0 0x0000003cd8a30265 in raise () from /lib64/libc.so.6 #1 0x0000003cd8a31d10 in abort () from /lib64/libc.so.6 #2 0x0000003cd8a6a84b in __libc_message () from /lib64/libc.so.6 #3 0x0000003cd8a7230f in _int_free () from /lib64/libc.so.6 #4 0x0000003cd8a7276b in free () from /lib64/libc.so.6 #5 0x000000000040eaba in fuse_reply_entry () #6 0x00000000004121c9 in fuse_reply_entry () #7 0x0000000000412580 in fuse_reply_entry () #8 0x0000003cd8a1d994 in __libc_start_main () from /lib64/libc.so.6 #9 0x0000000000402519 in fuse_reply_entry () #10 0x00007fff48cb59d8 in ?? () #11 0x0000000000000000 in ?? () (gdb) On Mon, Jun 27, 2011 at 10:41 PM, Stas Oskin <sta...@gm...> wrote: > Hi. > > We found an issue where mfsmount, when mounted via the autofs script, is > segfaults. > > 1) Is this a known issue? > > 2) What are other alternatives of mounting mfsmount? > > Thanks! > |
From: Stas O. <sta...@gm...> - 2011-06-27 19:41:52
|
Hi. We found an issue where mfsmount, when mounted via the autofs script, segfaults. 1) Is this a known issue? 2) What are the alternatives for mounting mfsmount? Thanks! |