From: cga2000 <cg...@op...> - 2007-02-16 23:37:44
|
More than 50% of the time my laptop hangs when booting with

... video=atyfb:1400x1050.

I have seen similar occurrences reported here and there with recent 2.6
kernels but nothing very clear as to whether this is a known problem or
whether a patch or workaround has already been provided.

I have not been able to determine if the hang was random -- booting to
another system and then rebooting back into this debian "etch" system
appears (???) to fix the problem.

Relevant "lspci -vvv" output if it matters:

0000.01.00.0 VGA compatible controller: ATI Technologies Inc Rage Mobility P/M AGP 2x (rev 64) (prog-if 00 [VGA])
	Subsystem: Dell: Unknown device 009e
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 66 (2000ns min), Cache Line Size: 0x08 (32 bytes)
	Interrupt: pin A routed to IRQ 11
	Region 0: Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
	Region 1: I/O ports at 2000 [size=256]
	Region 2: Memory at fc000000 (32-bit, non-prefetchable) [size=4K]
	Expansion ROM at <unassigned> [disabled] [size=128K]
	Capabilities: <available only to root>

I would be glad to provide additional info.

Thanks,

cga |
From: Ville <sy...@sc...> - 2007-02-17 13:19:24
|
On Fri, Feb 16, 2007 at 06:37:31PM -0500, cga2000 wrote:
> More than 50% of the time my laptop hangs when booting with
>
> ... video=atyfb:1400x1050.
>
> I have seen similar occurrences reported here and there with recent 2.6
> kernels but nothing very clear as to whether this is a known problem or
> whether a patch or workaround has already been provided.

I was just able to reproduce the problem by switching from drivers/ide
to libata. In my case the laptop would hang on every boot. Apparently
using libata changed the timing enough to trigger the bug.

Try this patch:

---
 drivers/video/aty/mach64_ct.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

Index: linux-2.6.20/drivers/video/aty/mach64_ct.c
===================================================================
--- linux-2.6.20.orig/drivers/video/aty/mach64_ct.c
+++ linux-2.6.20/drivers/video/aty/mach64_ct.c
@@ -598,7 +598,6 @@ static void aty_resume_pll_ct(const stru
 	struct atyfb_par *par = info->par;
 
 	if (par->mclk_per != par->xclk_per) {
-		int i;
 		/*
 		* This disables the sclk, crashes the computer as reported:
 		* aty_st_pll_ct(SPLL_CNTL2, 3, info);
@@ -609,12 +608,10 @@ static void aty_resume_pll_ct(const stru
 		aty_st_pll_ct(SCLK_FB_DIV, pll->ct.sclk_fb_div, par);
 		aty_st_pll_ct(SPLL_CNTL2, pll->ct.spll_cntl2, par);
 		/*
-		 * The sclk has been started. However, I believe the first clock
-		 * ticks it generates are not very stable. Hope this primitive loop
-		 * helps for Rage Mobilities that sometimes crash when
-		 * we switch to sclk. (Daniel Mantione, 13-05-2003)
+		 * The sclk has been started. Wait for the PLL to lock. 5 ms
+		 * should be enough according to mach64 programmers guide.
 		 */
-		for (i=0;i<=0x1ffff;i++);
+		mdelay(5);
 	}
 
 	aty_st_pll_ct(PLL_REF_DIV, pll->ct.pll_ref_div, par);

-- 
Ville Syrjälä
sy...@sc...
http://www.sci.fi/~syrjala/ |
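Editorial sketch, not part of the original thread: the patch above swaps a fixed-count empty loop for a time-based delay. The userspace C below tries to illustrate why that matters, under stated assumptions. The duration of a counted loop depends on how fast the CPU runs it (and a compiler may remove an empty loop outright unless the counter is volatile), whereas a time-based wait is tied to a clock. All function names here are hypothetical, and nanosleep() only stands in for the idea of a timed wait: the kernel's mdelay() busy-waits rather than sleeping, so this is an analogy, not the driver code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/*
 * Userspace stand-in for the old fixed-count loop (illustration only).
 * `volatile` stops the compiler from deleting the loop, but the time it
 * takes still depends on how fast the CPU happens to run it.
 */
void spin_delay(unsigned long iterations)
{
	volatile unsigned long i;

	for (i = 0; i <= iterations; i++)
		;
}

/*
 * Time-based wait: sleeps for at least `ms` milliseconds, resuming the
 * sleep if a signal interrupts it. Hypothetical helper, not a kernel API.
 */
void delay_ms(long ms)
{
	struct timespec req;

	assert(ms >= 0);
	req.tv_sec = ms / 1000;
	req.tv_nsec = (ms % 1000) * 1000000L;
	while (nanosleep(&req, &req) == -1 && errno == EINTR)
		;
}

/* Monotonic clock reading in microseconds. */
long long now_us(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000000LL + ts.tv_nsec / 1000;
}

/* How long delay_ms(ms) actually took, in microseconds. */
long long measure_delay_ms_us(long ms)
{
	long long start = now_us();

	delay_ms(ms);
	return now_us() - start;
}

/* How long spin_delay(iterations) took, in microseconds. */
long long measure_spin_us(unsigned long iterations)
{
	long long start = now_us();

	spin_delay(iterations);
	return now_us() - start;
}
```

Calling measure_spin_us(0x1ffff) on different machines would be expected to give quite different numbers, while measure_delay_ms_us(5) should never come back under 5000 microseconds. That is the property the PLL lock wait needs.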
From: cga2000 <cg...@op...> - 2007-02-17 17:14:37
|
On Sat, Feb 17, 2007 at 08:19:12AM EST, Ville Syrjälä wrote:
> On Fri, Feb 16, 2007 at 06:37:31PM -0500, cga2000 wrote:
> > More than 50% of the time my laptop hangs when booting with
> >
> > ... video=atyfb:1400x1050.
[..]
> I was just able to reproduce the problem by switching from drivers/ide
> to libata. In my case the laptop would hang on every boot. Apparently
> using libata changed the timing enough to trigger the bug.
>
> Try this patch:
[..]

Thanks.

I was planning on streamlining my new 2.6.18 kernel over the weekend so
I should be able to report back to you by Monday.

cga. |
From: cga2000 <cg...@op...> - 2007-02-20 07:44:04
|
On Sat, Feb 17, 2007 at 12:14:25PM EST, cga2000 wrote:
> On Sat, Feb 17, 2007 at 08:19:12AM EST, Ville Syrjälä wrote:
[..]
> > Try this patch:
[..]
> Thanks.
>
> I was planning on streamlining my new 2.6.18 kernel over the weekend so
> I should be able to report back to you by Monday.

Well I did succeed in streamlining my kernel+modules .. reducing the
size of the /usr/src/ tree from 600 Meg to about 275 .. but then it took
me two days and a night to get it to boot.

I think the problem was partly something that I inadvertently excluded
or compiled as a module in menuconfig .. with so many options that I
modified it's easy to hit the space bar one too many times .. and then I
think I was mostly getting confused by the "debian way" of compiling and
installing new kernels.

In any event, it's now 2:30 in the morning .. I have a kernel that takes
about 45 minutes to compile instead of four hours and doesn't panic ..
but I haven't found the time yet to apply your patch.

The worst of it is that I probably won't have another window of
opportunity to work on this for another two weeks or so.

It also looks like I may have to modify the patch for the 2.6.18 kernel.
I wouldn't mind giving 2.6.20 a shot but then some of the complementary
debian packages such as initrd-tools might not be in sync with the newer
version.

Regarding atyfb, I did have the time to notice two rather disturbing
things, though.

First of all since I was wasting a lot of time rebooting into my sarge
system or my fully-fledged, non-streamlined custom 2.6 kernel to bypass
the hang that the patch will address (*) .. I eventually figured I
should try to change the video=atyfb:1400x1050 to a vga= specification
on the fly by editing the grub "kernel" statement and much to my surprise
and though I got both the atyfb and then the vesafb messages on the
console it appears that atyfb "won" so to speak: although I had
specified vga=791 I was clearly switched to the 1400x1050 mode (rather
than 1024x768).

Then, since the "debian way" of installing new kernels automatically
updates your /boot/grub/menu.lst .. you have to remember to go and edit
it and add the "video=atyfb:1400x1050" manually every time you install a
new kernel. Now the weird thing is that at one point I accidentally
added the "atyfb" bit to the "title" statement instead of the "kernel"
statement. And yet atyfb guessed my "intentions" and automagically
switched me to 1400x1050.

So basically in both cases I was either telling the kernel/atyfb two
different things or not telling it anything at all and yet it appears to
have detected the native resolution of my display and enabled the
1400x1050 mode automatically.

Am I correct in assuming the above .. I mean .. is this how it works ..
Is this new with the current kernels & new versions of atyfb?

Thanks,

cga

(*) I boot a 2.6.18 kernel and reproduce the hang .. then say, I boot it
again .. I hang again as you would expect .. So, naturally I would then
boot the 2.4.27 kernel .. and then reboot the 2.6.18 kernel .. and this
time atyfb smoothly switches to the fb console as if nothing had
happened. In my case (fortunately ..) it seems to be systematic. No idea
whether this may be of interest or not. |
From: Ville <sy...@sc...> - 2007-02-25 11:19:06
|
On Tue, Feb 20, 2007 at 02:43:53AM -0500, cga2000 wrote:
>
> Regarding atyfb, I did have the time to notice two rather disturbing
> things, though.
>
> First of all since I was wasting a lot of time rebooting into my sarge
> system or my fully-fledged, non-streamlined custom 2.6 kernel to bypass
> the hang that the patch will address (*) .. I eventually figured I
> should try to change the video=atyfb:1400x1050 to a vga= specification
> on the fly by editing the grub "kernel" statement and much to my surprise
> and though I got both the atyfb and then the vesafb messages on the
> console it appears that atyfb "won" so to speak: although I had
> specified vga=791 I was clearly switched to the 1400x1050 mode (rather
> than 1024x768).

I always disable vesafb from my kernels. I don't trust it to play nice
with hw specific fbdev drivers. I think if you want to force vesafb you
need to pass video=vesafb to the kernel.

> Then, since the "debian way" of installing new kernels automatically
> updates your /boot/grub/menu.lst .. you have to remember to go and edit
> it and add the "video=atyfb:1400x1050" manually every time you install a
> new kernel. Now the weird thing is that at one point I accidentally
> added the "atyfb" bit to the "title" statement instead of the "kernel"
> statement. And yet atyfb guessed my "intentions" and automagically
> switched me to 1400x1050.
>
> So basically in both cases I was either telling the kernel/atyfb two
> different things or not telling it anything at all and yet it appears to
> have detected the native resolution of my display and enabled the
> 1400x1050 mode automatically.
>
> Am I correct in assuming the above .. I mean .. is this how it works ..
> Is this new with the current kernels & new versions of atyfb?

Yes, atyfb will automatically use the panel's native resolution.

-- 
Ville Syrjälä
sy...@sc...
http://www.sci.fi/~syrjala/ |
From: cga2000 <cg...@op...> - 2007-02-24 19:36:15
|
On Sat, Feb 17, 2007 at 08:19:12AM EST, Ville Syrjälä wrote:
> On Fri, Feb 16, 2007 at 06:37:31PM -0500, cga2000 wrote:
> > More than 50% of the time my laptop hangs when booting with
> >
> > ... video=atyfb:1400x1050.
[..]
> Try this patch:
[..]

I managed to make some progress on this issue.

1. I tried to apply the patch to the 2.6.18 kernel but it failed.

2. I took a look at the source and it looks like there was at least one
   intervening patch between 2.6.18 and 2.6.20. The .. par->mclk_per !=
   .. test appears to have been reversed:

'.. mclk_per == par->xclk_per) .. instead of '.. != par->xclk_per'
             ^                                   ^

3. I downloaded kernels 2.6.20 and 2.6.20.1 but patching failed on both
   hunks.

4. On the other hand, with these new versions, I was able to locate the
   code in mach64_ct.c and manually made the changes.

5. With your changes applied I rebooted at least a half a dozen times
   and I never again experienced the hang.

6. I obviously have not run the modified code for any length of time but
   as far as I can tell the changes do not seem to cause any adverse
   side-effects.

So, as far as I am concerned, your patch fixes the problem.

Now, since I have run into unrelated problems with the newer 2.6.20
kernels I would much rather stick with 2.6.18 for now.

Are you aware of any intervening patches to mach64_ct.c that I could
apply so as to bring the 2.6.18 code up to the more current level and
then make your recommended changes, or is there more to it than just
patching this particular program?

I will take a look at the change logs but since I am not a professional
kernel maintainer by a long way, I would greatly appreciate it if you
could advise on the best course of action.

Thank you very much indeed for your help.

cga |
From: Ville <sy...@sc...> - 2007-02-25 11:11:33
|
On Sat, Feb 24, 2007 at 02:35:57PM -0500, cga2000 wrote:
> On Sat, Feb 17, 2007 at 08:19:12AM EST, Ville Syrjälä wrote:
[..]
> > Try this patch:
[..]
> I managed to make some progress on this issue.
>
> 1. I tried to apply the patch to the 2.6.18 kernel but it failed.
>
> 2. I took a look at the source and it looks like there was at least one
>    intervening patch between 2.6.18 and 2.6.20. The .. par->mclk_per !=
>    .. test appears to have been reversed:
>
> '.. mclk_per == par->xclk_per) .. instead of '.. != par->xclk_per'
>              ^                                   ^

The test has been reversed because in 2.6.18 the busy loop is in the
'else' branch and in 2.6.20 it's in the 'if' branch.

> 3. I downloaded kernels 2.6.20 and 2.6.20.1 but patching failed on both
>    hunks.

IIRC I made the patch against 2.6.20 so it shouldn't have failed. Maybe
your mail program corrupted the patch.

> 4. On the other hand, with these new versions, I was able to locate the
>    code in mach64_ct.c and manually made the changes.
>
> 5. With your changes applied I rebooted at least a half a dozen times
>    and I never again experienced the hang.
>
> 6. I obviously have not run the modified code for any length of time but
>    as far as I can tell the changes do not seem to cause any adverse
>    side-effects.

There should be no side effects as this code is only executed when the
driver is initialized.

> So, as far as I am concerned, your patch fixes the problem.

Good. So that makes it three verified cases of fixing the bug.

> Now, since I have run into unrelated problems with the newer 2.6.20
> kernels I would much rather stick with 2.6.18 for now.
>
> Are you aware of any intervening patches to mach64_ct.c that I could
> apply so as to bring the 2.6.18 code up to the more current level and
> then make your recommended changes, or is there more to it than just
> patching this particular program?

The changes to mach64_ct.c were due to my patch that fixed resume from
suspend-to-ram. There were also other patches applied at the same time
but none were critical in any sense. You can stick with 2.6.18 + the
mdelay() fix.

-- 
Ville Syrjälä
sy...@sc...
http://www.sci.fi/~syrjala/ |
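Editorial sketch, not part of the original thread: Ville's point about the reversed test can be shown with two tiny C functions. This is a hedged illustration only. The function names are hypothetical, the real driver code on either kernel is not reproduced here, and what the matching-clocks path does in 2.6.18 is not shown in the thread. Each function simply reports whether the delay path would be reached for a given pair of clock periods.

```c
#include <stdbool.h>

/*
 * Layout described for 2.6.18: the test is "==" and the delay sits in
 * the else branch. Returns true when the delay path would be reached.
 */
bool delay_taken_else_form(int mclk_per, int xclk_per)
{
	if (mclk_per == xclk_per) {
		/* matching clocks: other work, not shown in the thread */
		return false;
	} else {
		/* the busy loop (or mdelay) would go here */
		return true;
	}
}

/*
 * Layout described for 2.6.20: the test is "!=" and the delay sits in
 * the if branch. Same outcome, only the branch arrangement differs.
 */
bool delay_taken_if_form(int mclk_per, int xclk_per)
{
	if (mclk_per != xclk_per) {
		/* the busy loop (or mdelay) would go here */
		return true;
	}
	return false;
}
```

Both forms reach the delay for exactly the same inputs, which is consistent with the advice that the mdelay() change can be made by hand on 2.6.18 without pulling in the other patches.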
From: cga2000 <cg...@op...> - 2007-02-26 21:23:45
|
On Sun, Feb 25, 2007 at 06:11:21AM EST, Ville Syrjälä wrote:
> On Sat, Feb 24, 2007 at 02:35:57PM -0500, cga2000 wrote:
> > On Sat, Feb 17, 2007 at 08:19:12AM EST, Ville Syrjälä wrote:
[..]
> >
> > 1. I tried to apply the patch to the 2.6.18 kernel but it failed.
> >
> > 2. I took a look at the source and it looks like there was at least one
> >    intervening patch between 2.6.18 and 2.6.20. The .. par->mclk_per !=
> >    .. test appears to have been reversed:
> >
> > '.. mclk_per == par->xclk_per) .. instead of '.. != par->xclk_per'
> >              ^                                   ^
>
> The test has been reversed because in 2.6.18 the busy loop is in the
> 'else' branch and in 2.6.20 it's in the 'if' branch.

This is why I assumed that there was an intervening patch (or patches)
between 2.6.18 and 2.6.20. So I was not too happy about making the
change to the 2.6.18 source thinking that the changes brought about by
the intervening patch(es) might be needed for your new patch to work
correctly.

> > 3. I downloaded kernels 2.6.20 and 2.6.20.1 but patching failed on both
> >    hunks.
>
> IIRC I made the patch against 2.6.20 so it shouldn't have failed. Maybe
> your mail program corrupted the patch.

Oh, that it did .. but it was not so much mutt that broke the patch ..
more of a problem with our respective email encodings. But from what I
could guess about what "patch" was complaining about, it looked more
like a problem with the line numbers -- the code is there all right but
the line numbers don't match. But that's OK, since I have little HD
space to spare and I don't plan on maintaining a patched source tree
alongside the original.

> > 4. On the other hand, with these new versions, I was able to locate the
> >    code in mach64_ct.c and manually made the changes.
> >
> > 5. With your changes applied I rebooted at least a half a dozen times
> >    and I never again experienced the hang.
> >
> > 6. I obviously have not run the modified code for any length of time but
> >    as far as I can tell the changes do not seem to cause any adverse
> >    side-effects.
>
> There should be no side effects as this code is only executed when the
> driver is initialized.

A convincing argument. I kinda suspected that, but then for all I knew,
it might set up things differently .. in ways that might resurface at a
later point in time.

> > So, as far as I am concerned, your patch fixes the problem.
>
> Good. So that makes it three verified cases of fixing the bug.

Glad I could help. Being able to run the framebuffer console at my
display's native resolution made a world of difference and giving your
patch a go was the least I could do. :-)

> > Now, since I have run into unrelated problems with the newer 2.6.20
> > kernels I would much rather stick with 2.6.18 for now.
> >
> > Are you aware of any intervening patches to mach64_ct.c that I could
> > apply so as to bring the 2.6.18 code up to the more current level and
> > then make your recommended changes, or is there more to it than just
> > patching this particular program?
>
> The changes to mach64_ct.c were due to my patch that fixed resume from
> suspend-to-ram. There were also other patches applied at the same time
> but none were critical in any sense. You can stick with 2.6.18 + the
> mdelay() fix.

2.6.20 has some major problems with my pcmcia CD burner and I really do
not have the time to report/research them at this point. And w/o the CD
burner I cannot back up my system.

So this is really excellent news.

Thank you very much for your help.

Thanks,

cga |