A very refreshing story on multi-seated X! I enjoyed reading it and
learned a lot about stability issues. Keep up the good work, and thank
you for this well-written report!
>
> Message: 2
> Date: Mon, 14 Nov 2005 21:36:46 -0600
> From: Open Sense Solutions <in...@op...>
> To: lin...@li...
> Subject: nvidia crashing info
>
> Here's some more info on the nvidia crashing problem so many people
> have experienced in one form or another:
> Our old Debian software with XFree86 used the nvidia driver and was
> very stable. Now with Ubuntu 5.10 and Xorg we are having a lot more
> problems. Using the open source nv driver instead of the binary
> nvidia driver, things are very stable. Restarting the primary X
> server repeatedly with Ctrl-Alt-Backspace can crash things, but we
> have gdm set to not restart the X servers. However, the binary
> nvidia driver crashes in about 1 in 20 logouts. A gdm restart gets
> things back, but the crashing is unacceptable. We have noticed
> interrupt conflicts that made things even worse, but even with the
> nvidia cards on a different interrupt from the ethernet card we
> still see crashes. It would be interesting to see whether giving
> each nvidia card its own interrupt would fix things. Of course, if
> sharing an interrupt causes crashes, the nvidia driver is not PCI
> spec compliant, but what will it take to convince nvidia to take a
> serious look at this issue?
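For anyone who wants to check their own box for the interrupt sharing
described above, here is a rough sketch that lists IRQ lines with more
than one device attached. It assumes the usual /proc/interrupts layout
(a header row of CPU names, then one row per IRQ with the device names
comma-separated at the end). The function name and the sample text are
made up for illustration and are not taken from the report.

```python
def shared_irqs(text):
    """Return {irq_number: [device, ...]} for IRQs shared by 2+ devices.

    `text` is the content of /proc/interrupts. Only numbered IRQ rows
    are considered; summary rows such as NMI or LOC are skipped.
    """
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())          # header row: one name per CPU
    shared = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue                      # not a numbered IRQ row
        fields = rest.split(None, ncpu)   # per-CPU counts, then the tail
        if len(fields) <= ncpu:
            continue                      # no controller/device text
        parts = [p.strip() for p in fields[ncpu].split(",")]
        if len(parts) < 2 or not parts[0].split():
            continue                      # only one device on this IRQ
        # The first part still carries the controller name; the device
        # name is its last whitespace-separated token.
        first = parts[0].split()[-1]
        shared[int(label)] = [first] + parts[1:]
    return shared


# Hypothetical sample, NOT output from the machines in the report.
SAMPLE = """\
           CPU0
  0:     123456    XT-PIC  timer
  9:       4821    XT-PIC  nvidia, eth0
 10:       9932    XT-PIC  nvidia
 11:       1200    XT-PIC  nvidia, uhci_hcd, uhci_hcd
NMI:          0
"""
```

On a live system you would pass open("/proc/interrupts").read() instead
of SAMPLE. This only shows which devices share a line; it says nothing
about whether sharing is what actually triggers the crash.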
>
> We just turned off a third station on one of our boxes to see if it
> was more stable, and it still crashed using the nvidia driver, but
> even more interestingly, the gdm.log had this message:
>
> NVIDIA: could not open the device file /dev/nvidia2 (Input/output error).
> (EE) NVIDIA(0): Failed to initialize the NVIDIA graphics device!
>
> and /dev/nvidia2 shouldn't even be in use! Why is the primary card
> trying to open /dev/nvidia2? We were only running two X servers,
> which should be using /dev/nvidia0 and /dev/nvidia1. /dev/nvidia2
> was probably created during our initial probe (we probe all cards,
> even if we don't have enough keyboards and mice, in case we want to
> start additional servers later), but it should not be referenced at
> all.
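A quick way to spot this kind of stray reference in a log is to pull
out every /dev/nvidiaN path and compare it with the devices you expect
to be in use. The sketch below does that. The function name is
hypothetical, and the sample text is just the two gdm.log lines quoted
above.

```python
import re

# Matches numbered device nodes such as /dev/nvidia0; /dev/nvidiactl
# has no digits after "nvidia" and is deliberately not matched.
DEVICE_RE = re.compile(r"/dev/nvidia(\d+)")


def unexpected_devices(log_text, expected):
    """Return sorted device indexes seen in log_text but not in expected."""
    seen = {int(m.group(1)) for m in DEVICE_RE.finditer(log_text)}
    return sorted(seen - set(expected))


# The two lines quoted from gdm.log in the message above.
LOG = (
    "NVIDIA: could not open the device file /dev/nvidia2 "
    "(Input/output error).\n"
    "(EE) NVIDIA(0): Failed to initialize the NVIDIA graphics device!\n"
)

# Two X servers are expected to use /dev/nvidia0 and /dev/nvidia1 only.
print(unexpected_devices(LOG, {0, 1}))
```

This only flags the reference. Why the driver opens that node is still
something only nvidia can answer.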
>
> This kind of error tells us something about how the cards are
> conflicting with other cards in the system, but only nvidia can make
> sense of it since the drivers are closed. We are fed up, and have 5
> ATI cards getting delivered tomorrow for experimentation: two 9250
> PCI, one 9250 AGP, and an X300 PCIe. Supposedly you can run two
> independent X servers with one dual-head card using the binary ATI
> driver, which would be a cost savings. The multiXnest approach gets
> you that even with nvidia, but it is slow. Everyone says ATI is not
> Linux friendly, but maybe once we figure out all the quirks it will
> not crash like nvidia. If Matrox were a little more Linux friendly
> we might try them, and their new PCIe x1 cards would be very useful
> in modern systems to allow more cards, but Matrox prices are just
> insane.