From: Ryan K. <li...@rk...> - 2003-10-21 17:20:38
|
I have set up UML 2.4.22-5um on two machines with identical software (Debian 3.0, kernel 2.4.22-ska3). I am using sparse files for the disk images used by the UML instances. UML boots and runs just fine, for the most part.

When I am installing packages inside a UML, or copying large amounts of data into the UML (from the network), the entire UML instance has a tendency to lock up for a short amount of time. This also occurs if the host is busy with heavy disk access: any UML instance that tries to write at the same time locks up. The UML instance always recovers, and the lockups occur primarily when the actual host disk usage of the sparse file has to be increased.

I have seen some posts to this list saying that expanding the on-disk usage of a sparse file can be time consuming. But the strange part of this problem is that on one machine, these lockups are less than a second (a few hundred milliseconds), and on the other machine they are several seconds (10-40 seconds). The former machine is a P2-400 w/256MB RAM, and 30GB IDE HD. The latter machine is an Athlon 1.4GHz w/1GB RAM, and 3x36GB Ultra160 RAID5 (software) SCSI array. So, yes, the far faster machine is showing these lockups much worse than the slower machine. :(

This is not a show stopper for my use of UML, since most of the time the disk space usage by a UML instance is steady state. Only when more disk is needed does its performance become "painful". I am curious why the faster machine has more problems than the slower machine. Does anyone have any ideas? Thanks.

PS. The faster machine is using high memory support in both the host and UML kernel, but disabling it in both makes no difference in behavior.

---------------------------------------------------------------------------
| "For to me to live is Christ, and to die is gain."                      |
|                                         --- Philippians 1:21 (KJV)      |
---------------------------------------------------------------------------
| Ryan Kirkpatrick | Boulder, Colorado | http://www.rkirkpat.net/         |
---------------------------------------------------------------------------
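The difference between a sparse image's apparent size and the space it actually occupies on the host can be checked directly. The following is a minimal sketch in Python, not anything from the original post: the helper names (`make_sparse`, `usage`, `write_at`) are hypothetical, and it assumes a POSIX host where `st_blocks` is counted in 512-byte units.

```python
import os
import tempfile


def make_sparse(path, size):
    """Create a file of `size` bytes without writing any data blocks."""
    with open(path, "wb") as f:
        f.truncate(size)


def usage(path):
    """Return (apparent_size, allocated_bytes) for `path`.

    Assumption: st_blocks is in 512-byte units, as on Linux.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


def write_at(path, offset, data):
    """Write `data` at `offset` and force it to disk so blocks get allocated."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        image = os.path.join(d, "root_fs.img")  # hypothetical image name
        make_sparse(image, 64 * 1024 * 1024)
        print("before write:", usage(image))
        write_at(image, 32 * 1024 * 1024, b"x" * 4096)
        print("after write: ", usage(image))
```

On a filesystem that supports sparse files, the allocated figure should start well below the apparent size and grow as the guest writes into previously untouched regions, which is the situation in which the lockups were observed.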
From: Luis R. G. C. <lrg...@in...> - 2003-10-24 01:46:01
|
On Tue, Oct 21, 2003 at 08:45:09AM -0600, Ryan Kirkpatrick wrote:
> ...
> I have seen some posts to this list that expanding the on disk
> usage of a sparse file can be time consuming. But the strange part of this
> problem is that on one machine, these lockups are less than a second (a
> few hundred milliseconds), and on the other machine they are several
> seconds (10-40 seconds). The former machine is a P2-400 w/256MB RAM, and
> 30GB IDE HD. The latter machine is a Athlon 1.4GHz w/1GB RAM, and 3x36GB
> Ultra160 RAID5 (software) SCSI array. So, yes, the far faster machine is
> showing these lockups much worse than the slower machine. :(

I've seen reports on many other lists of IDE disks seeming to be much faster than SCSI ones. It seems that IDE drives report success back to the OS as soon as the write is in their buffer, instead of waiting until it is on the disk. That may explain the difference in speed.

--
Rodrigo Gallardo
PGP Key ID: ADC9BC28
Fingerprint: 7C81 E60C 442E 8FBC D975 2F49 0199 8318 ADC9 BC28
From: Jeff D. <jd...@ad...> - 2003-10-25 01:30:03
|
li...@rk... said:
> I have seen some posts to this list that expanding the on disk usage
> of a sparse file can be time consuming. But the strange part of this
> problem is that on one machine, these lockups are less than a second
> (a few hundred milliseconds), and on the other machine they are
> several seconds (10-40 seconds). The former machine is a P2-400 w/
> 256MB RAM, and 30GB IDE HD. The latter machine is a Athlon 1.4GHz w/
> 1GB RAM, and 3x36GB Ultra160 RAID5 (software) SCSI array. So, yes, the
> far faster machine is showing these lockups much worse than the slower
> machine. :(

Is it just the UML that's wedged, or the whole box?

If it's really many tens of seconds, then some vmstat data would be interesting. Start it during this period and let it run for a little afterwards so that the difference is apparent.

Jeff
From: Ryan K. <li...@rk...> - 2003-10-27 16:03:35
|
On Fri, 24 Oct 2003, Jeff Dike wrote:
> li...@rk... said:
> > I have seen some posts to this list that expanding the on disk usage
> > of a sparse file can be time consuming. But the strange part of this
> > problem is that on one machine, these lockups are less than a second
> > (a few hundred milliseconds), and on the other machine they are
> > several seconds (10-40 seconds). The former machine is a P2-400 w/
> > 256MB RAM, and 30GB IDE HD. The latter machine is a Athlon 1.4GHz w/
> > 1GB RAM, and 3x36GB Ultra160 RAID5 (software) SCSI array. So, yes, the
> > far faster machine is showing these lockups much worse than the slower
> > machine. :(
>
> Is it just the UML that's wedged, or the whole box?

Just the UML is wedged; the host is fine, as if nothing out of the ordinary were going on.

> If it's really many 10s of seconds, then some vmstat data would be
> interesting. Start it during this period and let it run for a little
> afterwards so that the difference is apparent.

Okay, attached are four files from my two different machines:

  vmstat-slow.host.log :: Host 'vmstat 1' log on the P2.
  vmstat-slow.uml.log  :: UML 'vmstat 1' log on the P2.
  vmstat-fast.host.log :: Host 'vmstat 1' log on the Athlon.
  vmstat-fast.uml.log  :: UML 'vmstat 1' log on the Athlon.

For each pair of files, vmstat was started on both the host and the UML at about the same time and allowed to run for ~30 seconds. Then a copy via netcat and afio of ~200MB of PDF files over a 100 Mbit network from a separate machine (not the UML host) to the UML was started. I am using bridging and tuntap for network access. The filesystem image being used is a sparse file, and it expanded by at least 100MB during the file copy. The only difference I can see between the two UML instances is that on the P2 the allocated memory for the UML is 64MB, while on the Athlon it is 128MB.

The vmstat-fast.uml.log is quite interesting, as it is much, much shorter than the vmstat-fast.host.log. It is shorter because the UML system was wedged for much of the copy. The P2 took only a minute to do this transfer, while the Athlon took nearly five minutes.

Hopefully these logs are of use. If there are any other tests I can run, please let me know. Thanks!
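Because `vmstat 1` emits one sample per second, the gap in length between a host log and a UML log gives a rough estimate of how long the guest was wedged. The following is a minimal sketch in Python under stated assumptions: both logs cover the same wall-clock period, data rows consist only of non-negative integers, and the function names and the sample text are invented for illustration (they are not taken from the attached logs).

```python
def count_samples(vmstat_text):
    """Count data rows in vmstat output, skipping the header lines.

    A row is treated as data when every whitespace-separated field is a
    non-negative integer; header rows contain column names and are skipped.
    """
    count = 0
    for line in vmstat_text.splitlines():
        fields = line.split()
        if fields and all(field.isdigit() for field in fields):
            count += 1
    return count


def estimate_stall_seconds(host_text, uml_text, interval=1):
    """Estimate guest stall time from the samples the guest failed to emit."""
    missing = count_samples(host_text) - count_samples(uml_text)
    return max(missing, 0) * interval


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the input.
    header = " r  b  swpd  free  buff  cache  si  so  bi  bo  in  cs  us  sy  id\n"
    host_log = header + "0 0 0 1000 10 200 0 0 0 50 100 200 1 2 97\n" * 10
    uml_log = header + "0 0 0 500 5 100 0 0 0 40 100 150 5 10 85\n" * 4
    print(estimate_stall_seconds(host_log, uml_log), "seconds unaccounted for")
```

The estimate is only as good as the assumption that both vmstat runs were started and stopped together, which the post describes as approximately true.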
From: Ryan K. <li...@rk...> - 2003-11-30 21:33:47
|
On Tue, 21 Oct 2003, Ryan Kirkpatrick wrote:
> When I am installing packages inside an UML, or copying large
> amounts of data into the UML (from the network), the entire UML instance
> has a tendency to lock up for a short amount of time. This also occurs if
> the host is busy with heavy disk access (resulting in any UML instance
> that is trying to write at the same time, locking up). The UML instance
> always recovers.

This is a follow-up to my original post for the email archives. I found that disabling the sync I/O option in the UML's kernel (CONFIG_BLK_DEV_UBD_SYNC) removed the hangups I was observing. During heavy disk writes, the UML no longer locks up, and response to interactive processes (i.e. shells) remains acceptable.

Furthermore, with this option disabled, I ran a few quick tests of what happens if the UML is crashed during heavy disk writes (i.e. via the mconsole halt command). When using ext3 for the UML's filesystem, I saw no filesystem corruption and only very little data loss (which is to be expected). Therefore, I see no additional risk to data integrity from disabling this option, especially given the performance boost received.

Thanks for everyone's help.

PS. Though in all cases, if your data is important to you, you should make regular backups to some other location. :)
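The cost of synchronous writes on a given host can be measured outside UML. The following is a minimal sketch in Python that times the same sequence of writes with and without `O_SYNC`; it assumes (this is not confirmed in the thread) that the sync option causes each guest write to be forced to the host disk before returning. The function name, file names, and block sizes are hypothetical.

```python
import os
import tempfile
import time


def timed_writes(path, count, block_size, sync):
    """Write `count` blocks to `path` and return the elapsed seconds.

    With sync=True the file is opened with O_SYNC, so each write returns
    only after the host reports the data as written to the device.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if sync:
        flags |= os.O_SYNC
    block = b"\0" * block_size
    fd = os.open(path, flags, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        buffered = timed_writes(os.path.join(d, "buffered.img"), 256, 4096, sync=False)
        synced = timed_writes(os.path.join(d, "synced.img"), 256, 4096, sync=True)
        print(f"buffered: {buffered:.4f}s  O_SYNC: {synced:.4f}s")
```

Running it on the directory that holds the UML images, rather than a temporary directory, gives a figure closer to what the guest would experience. The results depend heavily on the storage stack, so a software RAID5 array and a single IDE disk may differ widely.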