From: Erik W. <om...@vc...> - 2005-08-19 07:53:38
|
I've got a system I'm building for a school in a small isolated Andean town in Bolivia, and I'm leaving on Sept 4th to go install it. Problem is, it's currently crashing ;-(

The machine is a basic Athlon 3000+ box with 6 (six) Radeon 7000 PCI video cards, and a whole mess of USB keyboards and mice. It's running Ubuntu 5.04 with xorg 6.8.2, and I've just tried xfree 4.3.0 with even worse results so far.

Here are the tricks I've got in place currently:

1) At bootup, I have a script that starts each X server, one at a time, just long enough to fully initialize the video card. The xorg.conf.N file is pretty standard except for the evdev (USB) setup, and VGAAccess=false. SingleCard is explicitly set to *false*. GLX/DRI are removed because they don't work with multiple cards as far as I can tell so far (would love to solve that though).

2) Once this is done, gdm.conf has 6 lines as follows:

    /usr/X11R6/bin/X -br -audit 0 -xf86config /etc/X11/xorg.conf.0s :0 vt7 -sharevts

The xorg.conf.Ns for this final configuration is identical except SingleCard is set to *true*.

The initial startup without SingleCard seems to be necessary in order for the VGA BIOS of the non-primary cards to initialize properly. If I neglect to perform that step and try to start the SingleCard servers right after bootup, any server on any card *except* the primary (bootup console) crashes somewhere in the middle of executing the BIOS (strace shows vm86old() or whatever it is several thousand times, then it seizes).

Now, once this is all up and running, it seems to operate perfectly. I can log into each head separately and each looks as if it were its own dedicated computer. That's the goal of the setup, obviously, as a single machine is more efficient in many respects (power, heat, cost, and most importantly in this case: shipping volume).

The problem is that the machine will then seize up with no warning anywhere from ~5 to ~30 minutes after starting up GDM.
This happens whether I am using one of the heads or not, and whether I've even *touched* the system or not. It just as regularly happens when I've just gone through the entire Ubuntu package list in Synaptic and am about to start downloading (sigh!) as when I'm not even in the same room and have started GDM remotely.

There are *no* messages of any kind, *anywhere*. I'm attempting to set up a KGDB environment, but need to hunt down either a null modem cable or the kgdb-over-ethernet patches for gdb before I can begin. Even then I have had no experience actually debugging the kernel (have only done a bit of driver development without gdb) so I don't know what kind of luck I will have.

Given the fact that this machine is basically worthless if it crashes once it's installed in Bolivia and I've come back home, I *really* need to try to solve this in the next 2 weeks. If not we'll be taking the gear but attempting to scrounge together as many single-headed machines as we can to make use of all the LCDs we're bringing with us. Lame, but better than nothing. ;-(

I've tried messing with combinations of SingleCard, VGAAccess, NoInt10, DPMS=false, -novtswitch, etc. to no avail. I haven't exhaustively tried NoAccel, and need to start taking more thorough notes of the combinations and results, but otherwise I'm almost totally out of ideas. Re: DPMS, it doesn't seem to have any relationship at all to shutdown or wakeup of the displays.

The only time I got any hint of having made progress was when I added -novtswitch to gdm.conf. Running two heads the machine made it through several near-complete deb-builds of xfree86 without crashing. However, once the two displays went to sleep, I tried to wake them. The primary head came back, but the secondary showed no signs of life at all. Over ssh I killed the server (-TERM), and the machine promptly crashed as soon as gdm tried to start the replacement server.
As mentioned above I tried xfree86 4.3.0-dfsg.1, after rebuilding it with the -sharevts and -novtswitch patches ported over from the xorg packages. With the same config files and commandlines, it won't even bring up two displays without crashing during initialization. I'll do a few more tests tomorrow to see if I've screwed a config somewhere compared to xorg, but I'm not very optimistic.

ATI's proprietary driver isn't my favorite thing, but I'd put up with it if it solved the problem. Only thing is, it doesn't support anything older than the 9xxx series anyway...

I'm in #xorg on freenode, but haven't gotten any significant suggestions so far, and it takes way too long to explain the issue ;-( If there are any other (active!) mailing lists I can send this to that might be relevant, I'd appreciate any suggestions.

TIA,
Omega
aka Erik Walthinsen
om...@vc... |
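The message above mentions needing more thorough notes on which option combinations have been tried. One way to keep such notes is to generate the full matrix up front and fill in results per run. The following is only a sketch: the option names come from the message, but treating each as a plain on/off switch, and the `minutes_until_crash` and `notes` columns, are assumptions made for illustration.

```python
import csv
import io
import itertools

# Option names are taken from the message above; modelling each one as a
# simple on/off switch is an assumption made for this sketch.
XORG_OPTIONS = ["SingleCard", "VGAAccess", "NoInt10", "DPMS", "NoAccel"]
SERVER_FLAGS = ["-novtswitch"]
ALL_SWITCHES = XORG_OPTIONS + SERVER_FLAGS


def option_matrix():
    """Yield one dict per on/off combination of every option and flag."""
    for values in itertools.product([False, True], repeat=len(ALL_SWITCHES)):
        yield dict(zip(ALL_SWITCHES, values))


def matrix_as_csv():
    """Return the matrix as CSV text, with blank columns for recording results."""
    out = io.StringIO()
    # The two result columns are hypothetical names, not from the thread.
    fields = ALL_SWITCHES + ["minutes_until_crash", "notes"]
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    for combo in option_matrix():
        writer.writerow(combo)  # result columns stay empty until a run is logged
    return out.getvalue()


if __name__ == "__main__":
    print(matrix_as_csv())
```

Six switches give 64 rows, which is a lot of 30-minute test runs, so in practice the list would probably be pruned by hand to the combinations that seem most likely to matter.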
From: Hugo V. <hvw...@ya...> - 2005-08-19 09:58:17
|
--- Erik Walthinsen <om...@vc...> wrote: > I've got a system I'm building for a school in a > small isolated Andean > town in Bolivia, and I'm leaving on Sept 4th to go > install it. Problem > is, it's currently crashing ;-( > <snip> > > The problem is that the machine will then seize up > with no warning > anywhere from ~5 to ~30 minutes after starting up > GDM. <snip> Did you run this setup on a vanilla system (no patches) first and does it crash then? Sounds just like the system I got here. It crashes too the way you describe, but it is the new mobo + CPU I put in because the old mobo does not crash! Hugo ____________________________________________________ Start your day with Yahoo! - make it your home page http://www.yahoo.com/r/hs |
From: Erik W. <om...@vc...> - 2005-08-19 16:28:36
|
Hugo Vanwoerkom wrote: > Did you run this setup on a vanilla system (no > patches) first and does it crash then? Yup, when I run with just one head it's been rock solid for the duration of testing. I'll run it singleheaded all this weekend while I'm gone, and get someone to prod it periodically to make sure. |
From: Hugo V. <hvw...@ya...> - 2005-08-20 10:04:23
|
--- Erik Walthinsen <om...@vc...> wrote:
> Hugo Vanwoerkom wrote:
> > Did you run this setup on a vanilla system (no
> > patches) first and does it crash then?
>
> Yup, when I run with just one head it's been rock solid for the duration
> of testing. I'll run it singleheaded all this weekend while I'm gone,
> and get someone to prod it periodically to make sure.

When you say that it crashes with no signs of anything, I presume you mean that everything freezes and you cannot move anything? That is a kernel panic that most likely shows up on the AGP vc console, the one you boot with.

Has that been active, i.e. visible during a crash?

H |
From: Erik W. <om...@vc...> - 2005-08-22 19:27:03
|
Hugo Vanwoerkom wrote:
> When you say that it crashes with no signs of
> anything, I presume you mean that everything freezes
> and you cannot move anything?
> That is a kernel panic that most likely shows up on
> the AGP vc console, the one you boot with.
>
> Has that been active, i.e. visible during a crash?

Not until now. I just ran a test with the 2nd and 3rd heads running X and the first head (all PCI) sitting at a text VT. Problem is, the text VT got thoroughly hosed once the X servers started up, incapable of scrolling when e.g. trying to log in, and getting interleaved in weird patterns when trying to switch VTs. I may try putting a Matrox G400 AGP card in to see if I can avoid this kind of corruption by having a totally different engine on the 1st head.

However, there is nothing on the console now that the machine has crashed. AFAIK there were no events such as DPMS wakeup etc. to trigger the crash.

Now I'm going to see a) whether I have, or have to go buy, a null modem cable, and b) whether kgdb can give me anything useful. |
From: Hugo V. <hvw...@ya...> - 2005-08-23 11:43:39
|
--- Erik Walthinsen <om...@vc...> wrote:
> Hugo Vanwoerkom wrote:
> > When you say that it crashes with no signs of
> > anything, I presume you mean that everything freezes
> > and you cannot move anything?
> > That is a kernel panic that most likely shows up on
> > the AGP vc console, the one you boot with.
> >
> > Has that been active, i.e. visible during a crash?
>
> Not until now, I just ran a test with the 2nd and 3rd heads running X
> and the first head (all PCI) sitting at a text VT. Problem is, the text
> VT got throroughly hozed once the X servers started up, incapable of
> scrolling when e.g. trying to log in, and getting interleaved in weird
> patterns when trying to switch VTs. I may try putting a Matrox G400 AGP
> card in to see if I can avoid this kind of corruption by having a
> totally different engine on the 1st head.
>
> However, there is nothing on the console now that the machine has
> crashed. AFAIK there were no events such as DPMS wakeup etc to trigger
> the crash.
>
> Now I'm going to see if a) I have or have to go buy a null modem cable,
> and b) whether kgdb can give me anything useful.

The reason for the textconsole is if the crash is a kernel panic then that is where the last message will show up.

I am still not clear on what exactly dies. Is anything still running? When the crash occurs, everything is dead?

H |
From: Erik W. <om...@vc...> - 2005-08-23 18:48:58
|
Hugo Vanwoerkom wrote:
> The reason for the textconsole is if the crash is a
> kernel panic then that is where the last message will
> show up.

Of course. But if it's corrupted by the other heads, I'm not sure whether or not a panic message would even have showed up anyway. OTOH kgdb should give me even better results, if/when it happens again.

> I am still not clear on what exactly dies. Is anything
> still running?
> When the crash occurs, everything is dead?

It varies. The most extreme crash is where everything just locks exactly where it is, with some heads sleeping and others with a mouse on its way to some menu somewhere on the screen, whatever is going on. In these cases the machine ceases to respond to pings, and I have to hard-reset the box.

Other cases involve a single head going black, and even shutting down the CRTC completely. Yesterday I was actively using one of the heads and it blanked out from underneath me, then the LCD switched to displaying "No Cable Attached"... Thing is, the rest of the machine was working properly, to the point of the other heads going to sleep and waking up as expected, before I rebooted the machine later.

Sometimes I get one or more of the heads blanking a la DPMS, then never waking up, while the machine is otherwise working just fine. Periodically in this situation attempting to wake a display by moving the mouse or hitting a key will trigger a *complete* crash a la the first paragraph above.

And of course once I put a kgdb kernel on the machine, and switched to xserver-xorg-dbg-6.8.2-1ubuntu, it ran for many hours operating just fine. Classic Heisenbug.

I'll be hammering on it continuously for the remainder of the next two weeks, and if running a debug X server "solves" the problem, so be it. I'll have more time to *really* solve the problem once I get back. |
From: Hugo V. <hvw...@ya...> - 2005-08-23 19:31:09
|
--- Erik Walthinsen <om...@vc...> wrote:
> Hugo Vanwoerkom wrote:
> > The reason for the textconsole is if the crash is a
> > kernel panic then that is where the last message will
> > show up.
>
> Of course. But if it's corrupted by the other heads, I'm not sure
> whether or not a panic message would even have showed up anyway. OTOH
> kgdb should give me even better results, if/when it happens again.
>
> > I am still not clear on what exactly dies. Is anything
> > still running?
> > When the crash occurs, everything is dead?
>
> It varies.
<snip>

You probably have already said this, but what patch version of Ruby are you running?

H |
From: Erik W. <om...@vc...> - 2005-08-23 19:37:10
|
Hugo Vanwoerkom wrote: > You probably have already said this, but what patch > version of Ruby are you running? None. It's not necessary for this setup AFAICT, esp since it works properly except for this crash. Ruby seems to be far too much complexity for an X-only multi-seat system. Also, the 'multiseat' package that's in Ubuntu that will theoretically set up this kind of system (but is not even alpha-level yet, I'll be working on dramatic enhancements to it once I have the second round of hardware), doesn't use Ruby at all. |
From: Helge H. <hel...@ai...> - 2005-08-24 09:23:45
|
Erik Walthinsen wrote:
>Hugo Vanwoerkom wrote:
>>The reason for the textconsole is if the crash is a
>>kernel panic then that is where the last message will
>>show up.
>
>Of course. But if it's corrupted by the other heads, I'm not sure
>whether or not a panic message would even have showed up anyway. OTOH
>kgdb should give me even better results, if/when it happens again.
>
>>I am still not clear on what exactly dies. Is anything
>>still running?
>>When the crash occurs, everything is dead?
>
>It varies. The most extreme crash is where everything just locks
>exactly where it is, with some heads sleeping and others with a mouse on
>its way to some menu somewhere on the screen, whatever is going on. In
>these cases the machine ceases to respond to pings, and I have to
>hard-reset the box.
>
>Other cases involve a single head going black, and even shutting down
>the CRTC completely. Yesterday I was actively using one of the heads
>and it blanked out from underneath me, then the LCD switched to
>displaying "No Cable Attached"... Thing is, the rest of the machine was
>working properly, to the point of the other heads going to sleep and
>waking up as expected, before I rebooted the machine later.

This is probably the classic problem where an xserver mistakenly thinks it has the "only" vga-compatible card in the box, and tries reprogramming video timings using legacy vga hardware addresses that might affect the wrong card.

A workaround: make sure you set up all your xservers with at least two different resolutions. The user who got a blank display can then tap ctrl+alt and "+" or "-" on the numeric keypad, and that xserver will reset its own resolution. (If the other resolution isn't wanted, just press the key combination again to get back.) The resolution is reset, and the display restored.

Another workaround: Use only graphics cards that aren't backwards compatible with VGA - at all. If you can find such cards, that is.
Third workaround: Configure framebuffers in your kernel, get a framebuffer for each and every screen. Set suitable resolutions using kernel boot parameters for the framebuffer drivers. Use the framebuffer xserver. This one doesn't try to set the resolution, it uses whatever resolution the framebuffer is set to. So the problems never happen. Of course this driver is unaccelerated, so your cpu(s) are going to do all the rendering work. Performance may still be fine for simple 2D work like word processing and web browsing and simple 2D games. (I didn't notice any X performance difference between accelerated and unaccelerated 2D X with a 400MHz celeron cpu and two screens.)

The correct fix: complain to x.org developers. Perhaps a fix will materialize someday. It certainly won't happen if nobody complains.

Helge Hafting |
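The first workaround above (at least two resolutions per xserver, so a blanked head can be recovered with ctrl+alt and keypad +/-) comes down to listing more than one mode in each Screen section. A small sketch of generating such a section follows. The identifiers `ScreenN`, `CardN` and `MonitorN`, the default depth, and the example modes are placeholders assumed for illustration, not taken from anyone's actual xorg.conf.

```python
def render_screen_section(index, modes, depth=24):
    """Render an xorg.conf Screen section that lists at least two modes.

    The Identifier/Device/Monitor names are hypothetical placeholders and
    must match whatever the rest of the config file really uses.
    """
    if len(modes) < 2:
        # With a single mode there is nothing to switch to and back from.
        raise ValueError("need at least two modes for the switch-and-return trick")
    mode_list = " ".join('"%s"' % mode for mode in modes)
    lines = [
        'Section "Screen"',
        '    Identifier   "Screen%d"' % index,
        '    Device       "Card%d"' % index,
        '    Monitor      "Monitor%d"' % index,
        '    DefaultDepth %d' % depth,
        '    SubSection "Display"',
        '        Depth    %d' % depth,
        '        Modes    %s' % mode_list,
        '    EndSubSection',
        'EndSection',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_screen_section(1, ["1024x768", "800x600"]))
```

The first mode in the list is the one the server starts in; the second only exists so there is something to cycle to and return from.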
From: Erik W. <om...@vc...> - 2005-08-24 17:09:50
|
Helge Hafting wrote:
> This is probably the classic problem where an xserver mistakenly
> thinks it has the "only" vga-compatible card in the box, and
> tries reprogramming video timings using legacy vga hardware
> addresses that might affect the wrong card.

Isn't this what the "VGAAccess" "false" option is for?

> Third workaround: Configure framebuffers in your kernel,
> get a framebuffer for each and every screen.

This is something I tried for a while, but didn't have much luck with. I'll give it another shot though. Acceleration isn't a huge deal, and 3D is out with multiple cards anyway, so speed should be acceptable.

What I'd really like to be able to do is get both heads of the dual-headed cards running separate framebuffers, and get separate X servers running on those. That would let me get away with only 3 video cards for 6 heads, freeing other PCI slots for either other cards, or smaller motherboards. Problem is, I couldn't figure out if it is even possible to have separate /dev/fb devices with the kernel fb driver... Everything I tried was either 5+ years old and didn't exist anymore, or didn't work in the slightest otherwise. |
From: Erik W. <om...@vc...> - 2005-08-24 21:10:18
|
Erik Walthinsen wrote: >>Third workaround: Configure framebuffers in your kernel, >>get a framebuffer for each and every screen. > > This is something I tried for a while, but didn't have much luck with. > I'll give it another shot though. No go. When loading the radeonfb module, I get /dev/fb0 for the first card, but no other cards load up. They all come back with "0k" VRAM and refuse to create an FB device. There are hacks in the driver for a couple of video cards to deal with this situation, but not for the same reason. |
From: Helge H. <hel...@ai...> - 2005-08-26 08:10:50
|
Erik Walthinsen wrote:
>Erik Walthinsen wrote:
>>>Third workaround: Configure framebuffers in your kernel,
>>>get a framebuffer for each and every screen.
>>
>>This is something I tried for a while, but didn't have much luck with.
>>I'll give it another shot though.
>
>No go. When loading the radeonfb module, I get /dev/fb0 for the first
>card, but no other cards load up. They all come back with "0k" VRAM and
>refuse to create an FB device. There are hacks in the driver for a
>couple of video cards to deal with this situation, but not for the same
>reason.

According to Documentation/fb/framebuffer.txt, you are supposed to get a fb for each _card_ at least. There are further difficulties if you're trying for a fb per head on a multihead card, but one fb per card is supposed to work.

Do the device nodes /dev/fb1, /dev/fb2, and so on exist?

What happens if you compile the framebuffer driver into the kernel, instead of using a module? Or if you try to load that module a second time?

Also make sure vesafb is disabled (or the module _not_ loaded), for vesafb just gets in the way and supports only one card.

Wait... You say they come up with 0k VRAM and refuse to load. That is typical for the case where the cards were found, but the BIOS hasn't initialized them. If all your cards are the same type, check to see if the BIOS has an option for initializing more than one card. It likely won't, though. There is a way around this, but it is long. Basically:

1. Boot the machine. Make sure _no_ framebuffer modules load at this time.

2. Run X briefly. This special X should be configured to simply use all the cards, and initialize them via the int10 routine. This way, every card will get its BIOS initialization. Make sure this Xserver quits after initialization. For example, it could use /bin/false as "window manager".

3. _Now_ load the framebuffer driver. It should find that all the cards are initialized and have enough RAM and so on. You should get a framebuffer (but not a console) for each screen. You may verify this by

       cat somefile > /dev/fb1
       cat somefile > /dev/fb2
       ...

   and see how garbage appears on each screen as the contents of "somefile" are written to video memory.

4. Now start your regular xservers, those that use the framebuffers for managing resolution.

Helge Hafting |
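Before step 4, a quick check that every card really got a framebuffer node can save a confusing X startup failure. The sketch below only tests for the existence of /dev/fb0 through /dev/fbN-1. It assumes one framebuffer per card, numbered from zero in card order, which is what the procedure above expects but not something the thread confirms for this hardware.

```python
import os


def missing_framebuffers(card_count, dev_dir="/dev"):
    """Return the fbN device paths that do not exist for cards 0..card_count-1.

    Assumes one framebuffer node per card, numbered from zero. dev_dir is a
    parameter only so the check can be tried against a scratch directory.
    """
    expected = [os.path.join(dev_dir, "fb%d" % n) for n in range(card_count)]
    return [path for path in expected if not os.path.exists(path)]


if __name__ == "__main__":
    # Six cards, as in the machine described at the top of the thread.
    missing = missing_framebuffers(6)
    if missing:
        print("no framebuffer node for:", ", ".join(missing))
    else:
        print("all framebuffer nodes present")
```

A node existing does not prove the driver initialized the card properly (udev may create nodes lazily or not at all), so writing a file to each device as described above remains the more convincing test.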
From: Erik W. <om...@vc...> - 2005-08-28 23:53:42
|
Helge Hafting wrote: > There are further difficulties > if you're trying for a fb per head on a multihead card, but one > fb per card is supposed to work. AFAIK only the Matrox fb driver currently actually does multiple fb devices. Though Radeon cards have had multiple CRTCs for years now... > The device-nodes /dev/fb1, /dev/fb2, and so on exists? Via udev, yes, as needed. > What happens if you compile the framebuffer driver into the kernel, > instead of using a module? Or if you tries to load that module > a second time? I'll try compiling it in, but I don't think I can load the module multiple times. > It likely won't, though. There is a way around this, but it is long. This is basically what I'm doing to initialize the cards currently, with a script that starts up and kills each X server in sequence, with SingleCard "false". Otherwise the SingleCard "true" server for any but the "primary" card will lock the machine at startup. Is there any code out there to execute the int10 VGA BIOS without having to start up a whole X server? |
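For reference, the start-and-kill initialization pass described above, followed by the per-head gdm.conf server lines, might be scripted along these lines. This is a sketch under several assumptions: that head N uses display :N with /etc/X11/xorg.conf.N for the init pass and /etc/X11/xorg.conf.Ns for the final servers, that every head shares vt7 (the thread only shows the line for head 0), and that a fixed `settle_seconds` wait is long enough for the VGA BIOS to finish. None of this is the actual script from the thread.

```python
import subprocess
import time

X_BINARY = "/usr/X11R6/bin/X"


def init_command(head):
    """Command for the throwaway server that initializes one card.

    Assumes head N maps to display :N and config file xorg.conf.N
    (the SingleCard "false" variant).
    """
    return [X_BINARY, "-xf86config", "/etc/X11/xorg.conf.%d" % head, ":%d" % head]


def gdm_server_line(head):
    """Server command line for gdm.conf, following the head-0 line in the thread.

    Sharing vt7 across all heads is an assumption; only head 0 is shown
    in the original message.
    """
    return "%s -br -audit 0 -xf86config /etc/X11/xorg.conf.%ds :%d vt7 -sharevts" % (
        X_BINARY, head, head)


def initialize_cards(heads, settle_seconds=10):
    """Start and stop one X server per head, in sequence."""
    for head in range(heads):
        proc = subprocess.Popen(init_command(head))
        time.sleep(settle_seconds)  # hypothetical wait for BIOS init to complete
        proc.terminate()            # SIGTERM, then wait so servers never overlap
        proc.wait()


if __name__ == "__main__":
    for head in range(6):
        print(gdm_server_line(head))
```

Running the servers strictly one after another mirrors the "one at a time" description; whether overlapping them would be safe is exactly the kind of thing this thread leaves open.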
From: Helge H. <hel...@ai...> - 2005-08-29 07:03:31
|
Erik Walthinsen wrote: >Helge Hafting wrote: > > >>There are further difficulties >>if you're trying for a fb per head on a multihead card, but one >>fb per card is supposed to work. >> >> >AFAIK only the Matrox fb driver currently actually does multiple fb >devices. Though Radeon cards have had multiple CRTCs for years now... > > > >>The device-nodes /dev/fb1, /dev/fb2, and so on exists? >> >> >Via udev, yes, as needed. > > > >>What happens if you compile the framebuffer driver into the kernel, >>instead of using a module? Or if you tries to load that module >>a second time? >> >> >I'll try compiling it in, but I don't think I can load the module >multiple times. > > > >>It likely won't, though. There is a way around this, but it is long. >> >> >This is basically what I'm doing to initialize the cards currently, with >a script that starts up and kills each X server in sequence, with >SingleCard "false". Otherwise the SingleCard "true" server for any but >the "primary" card will lock the machine at startup. > > > I think you may cut down on the time for this by starting (and killing) one xserver that uses all the screens at once. It will then initialize all those adapters. This should be a safe approach, as xserver support for using many cards (for a single user) is very old. You'll need an extra xorg.conf file of course. Anyway, loading the framebuffer driver _after_ this initialization completes, is supposed to work. Make sure the driver isn't present before though, or it will give up too early on those uninitialized cards. >Is there any code out there to execute the int10 VGA BIOS without having >to start up a whole X server? > > Not that I know of. One approach could be to extract this code from the X codebase, and throw away the rest of X. Helge Hafting |
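The extra xorg.conf for the single initialization server mostly differs in its ServerLayout, which has to reference every screen. A sketch of generating that section for N cards is shown below. The layout identifier and the `ScreenN` names are placeholders assumed for illustration, and the fragment is incomplete on its own: the matching Device, Monitor, Screen and InputDevice sections still have to exist in the same file.

```python
def render_init_layout(card_count):
    """Render a ServerLayout section that pulls in every screen at once.

    Screen names are hypothetical placeholders ("Screen0", "Screen1", ...).
    Input device references are deliberately left out of this fragment.
    """
    if card_count < 1:
        raise ValueError("need at least one card")
    lines = ['Section "ServerLayout"', '    Identifier "InitAllCards"']
    for n in range(card_count):
        if n == 0:
            # The first screen is placed at the origin.
            lines.append('    Screen 0 "Screen0" 0 0')
        else:
            # Each further screen is placed to the right of the previous one.
            lines.append('    Screen %d "Screen%d" RightOf "Screen%d"' % (n, n, n - 1))
    lines.append('EndSection')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_init_layout(6))
```

The left-to-right placement is arbitrary here, since this server only lives long enough to run each card's BIOS initialization.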
From: Erik W. <om...@vc...> - 2005-08-29 20:54:27
|
Helge Hafting wrote:
> I think you may cut down on the time for this by starting (and killing)
> one xserver that uses all the screens at once. It will then initialize
> all those adapters. This should be a safe approach, as xserver support
> for using many cards (for a single user) is very old. You'll need an
> extra xorg.conf file of course.

Thanks, that works nicely. I'll have to do some major changes to my scripts if I'm going to make it semi-automagic, but for now hardcoding works. It probably cut the time in half, the remaining time being VGA BIOS initialization.

> Anyway, loading the framebuffer driver _after_ this initialization
> completes, is supposed to work. Make sure the driver isn't
> present before though, or it will give up too early on those
> uninitialized cards.

Now I'm having problems getting the fbdev driver to work. Even with only one card in the machine and a config file generated by `dpkg-reconfigure xserver-xorg` for fbdev, it fails with:

    (EE) FBDEV(0): FBIOPUT_VSCREENINFO: Invalid argument
    (EE) FBDEV(0): mode initialization failed

I'm getting 2.6.11 built, so I can build a backstreet-ruby version as suggested to maybe deal with bus issues. I'll also try the just-released 2.6.13, as /. made mention of significant PCI changes, which, if I'm really hitting a bus contention issue (which I've suspected from the beginning but have not the slightest chance of confirming), might make a difference one way or another... |
From: Helge H. <hel...@ai...> - 2005-08-25 07:19:58
|
Erik Walthinsen wrote:
>Helge Hafting wrote:
>>This is probably the classic problem where an xserver mistakenly
>>thinks it has the "only" vga-compatible card in the box, and
>>tries reprogramming video timings using legacy vga hardware
>>addresses that might affect the wrong card.
>
>Isn't this what the "VGAAccess" "false" option is for?

I don't know, but I'll look at it someday. I too have this problem: restarting X on my secondary card messes up the display on the first, which can be solved by tapping ctrl, alt, and the big plus on the keypad. Works for me, but my users get so confused. Maybe vgaaccess helps. Maybe using the framebuffer option will help; according to the docs I'll still get acceleration, but X will use the framebuffer drivers to do resolution changes. That _should_ remove trouble upon x restart, when X tries to set the resolution.

>>Third workaround: Configure framebuffers in your kernel,
>>get a framebuffer for each and every screen.
>
>This is something I tried for a while, but didn't have much luck with.
>I'll give it another shot though. Accelleration isn't a huge deal, and
>3D is out with multiple cards anyway, so speed should be acceptible.

Actually, 3D isn't out with multiple cards. I can run 3D both on my matrox G550 and the radeon 9200 SE at the same time. It is not _stable_; the kernel will occasionally trip up, freezing one or both displays. So I don't use it. But this is merely a bug in the radeon driver. The kernel and the DRI/DRM 3D systems really do support multihead 3D these days. My problems are not a result of dual-seat; running 3D on the radeon alone can go wrong too.

>What I'd really like to be able to do is get both heads of the
>dual-headed cards running separate framebuffers, and get separate X
>servers running on those. That would let me get away with only 3 video
>cards for 6 heads, freeing other PCI slots for either other cards, or
>smaller motherboards. Problem is, I couldn't figure out if it is even
>possible to have separate /dev/fb devices with the kernel fb driver...

It is possible, but only if the framebuffer driver for that particular card supports it. The driver for matrox G550 (and G400) supports this. Most other drivers don't, so this really limits your options when buying video cards. I've done it with the G550 though, when the radeon's old predecessor died. I then got the radeon, and got disappointed with its 3D instability. So I use it without 3D, mostly because the G550 seems to dislike 24-bit color which I like. :-/

Matrox also have some cards that really are "two/four cards in one", in that the pci subsystem believes there are two cards instead of one. So they work exactly as two (or four!) cards when setting up the software. (Kernel framebuffer drivers & xservers.) But they don't use up that many slots. Very expensive though. Of course a "double" card can justify double price, but they're worse.

>Everything I tried was either 5+ years old and didn't exist anymore, or
>didn't work in the slightest otherwise.

Well, matrox works, although old. And I believe you still can order the cards too. "old" shouldn't be a problem if you consider unaccelerated 2D-only ok though. (Actually, software 3D rendering works fine with such setups. Way too slow for a first-person 3D game of course, but ok for some other programs like frozen-bubble which isn't really 3D but needs opengl presence anyway.)

Helge Hafting |
From: Erik W. <om...@vc...> - 2005-08-25 17:04:23
|
Helge Hafting wrote: > I don't know, but I'll look at it someday. I too have this problem, > restarting X on my secondary card messes up the display > on the first, which can be solved by tapping ctrl, alt, and the > big plus on the keypad. Works for me, but my users get so confused. I don't have any problems with starting up, I've managed to get around the various problems there, partially by using the hack of starting each server *without* SingleCard one at a time first, to initialize the VGA BIOS cleanly. The problem still remains that once the system is up, it will crash periodically, with no obvious trigger event. It's crashing so hard that KGDB doesn't even work. I'm out for the weekend, so I have Monday through Friday of next week to try to resolve this, or I'm going to have to go down there and cobble together as many single-headed systems as I can manage from the few scrap bits of hardware they had in the pictures I saw. Not a good solution. ;-( |
From: Fredrik T. <fr...@do...> - 2005-08-28 13:32:41
|
On Fri, 2005-08-19 at 00:53 -0700, Erik Walthinsen wrote: > I've got a system I'm building for a school in a small isolated Andean > town in Bolivia, and I'm leaving on Sept 4th to go install it. Problem > is, it's currently crashing ;-( [...] > The problem is that the machine will then seize up with no warning > anywhere from ~5 to ~30 minutes after starting up GDM. This happens > whether I am using one of the heads or not, and whether I've even > *touched* the system or not. It just as regularly happens when I've > just gone through the entire Ubuntu package list in Synaptic and am > about to start downloading (sigh!) as when I'm not even in the same room > and have started GDM remotely. With 6 video cards, have you considered the possibility that it might simply be a problem with heat dissipation? -- Fredrik Tolf |
From: Erik W. <om...@vc...> - 2005-08-28 23:50:28
|
Fredrik Tolf wrote: > With 6 video cards, have you considered the possibility that it might > simply be a problem with heat dissipation? Considered it, yes, and confirmed it's not the issue (though I really wish it were, cause the solutions are easy). Neither is power. The cards are all Radeon 7000's with very little heat dissipation and power use, and I've both beefed up the power supply with no change, and have an 80mm fan blowing straight onto them. There is no discernable heat when touching the cards' heatsinks. Besides, it happens with just two cards in the machine as well ;-( |
From: Fredrik T. <fr...@do...> - 2005-08-29 20:31:43
|
On Sun, 2005-08-28 at 16:50 -0700, Erik Walthinsen wrote: > Fredrik Tolf wrote: > > With 6 video cards, have you considered the possibility that it might > > simply be a problem with heat dissipation? > > Considered it, yes, and confirmed it's not the issue (though I really > wish it were, cause the solutions are easy). Neither is power. The > cards are all Radeon 7000's with very little heat dissipation and power > use, and I've both beefed up the power supply with no change, and have > an 80mm fan blowing straight onto them. There is no discernable heat > when touching the cards' heatsinks. > > Besides, it happens with just two cards in the machine as well ;-( Bummer... Even so, however, I get the distinct feeling that this is because of some kind of bus lockup or similar, rather than a kernel panic -- especially since you stated earlier that you hadn't even patched your kernel with ruby. In fact, why not try patching the kernel with ruby and see if that works? It might just be worth a shot... Fredrik Tolf |