|
From: Rene B. <re...@we...> - 2001-09-26 18:52:45
|
Hi!
I am working hard on the 53c770 driver and there is a cache problem on
APUS/PPC. Is there anybody here who can tell me how I can allocate some kernel
memory which is not cached (for DMA or to communicate with the 53c770)?
I'm using something like this:
np = __get_free_pages(GFP_ATOMIC, 1);
cache_push((u_long)virt_to_phys(np), 8192);
cache_clear((u_long)virt_to_phys(np), 8192);
kernel_set_cachemode((np), 8192, IOMAP_NOCACHE_SER);
Then the driver starts a test in which the 53c770 exchanges some 53c770
register values with some values in np.
The test fails, because np is cached!
If I explicitly call flush_dcache_range(np, np + sizeof(np)); before
and after this test, the test doesn't fail.
I have also tried other methods to get some memory which is not cached,
i.e. pci_alloc_consistent() (which is based on __get_free_pages()) and
manipulating TLBs as described in Documentation/cachetlb.txt, without
success. None of these methods prevents caching of the
allocated region.
Ciao, Renè
|
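One detail in the flush_dcache_range() call quoted above may be worth checking. If np is the unsigned long returned by __get_free_pages() (an assumption, the post does not show its declaration), then sizeof(np) is the size of that variable, not the 8192 bytes that were allocated, so the flushed range covers only the first few bytes. A minimal user-space sketch of the difference; REGION_SIZE and both helper names are made up for illustration:

```c
#include <assert.h>

/* Hypothetical constant: the two 4096-byte pages allocated in the post. */
#define REGION_SIZE 8192UL

/* End address as written in the post: start + sizeof(variable).
 * With np declared as unsigned long this adds only 4 or 8 bytes. */
static unsigned long flush_end_as_posted(unsigned long np)
{
    return np + sizeof(np);
}

/* End address that covers the whole allocation. */
static unsigned long flush_end_whole_region(unsigned long np)
{
    /* Guard against wrap-around at the top of the address space. */
    assert(np <= (unsigned long)-1 - REGION_SIZE);
    return np + REGION_SIZE;
}
```

If the values exchanged with the chip all sit at the very start of np, the short range can still appear to work, which would be consistent with what the post reports.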
|
From: Alan B. <al...@ms...> - 2001-09-27 11:14:48
|
hi,
> np = __get_free_pages(GFP_ATOMIC, 1);
why the 1 instead of 0 ?
> cache_push((u_long)virt_to_phys(np), 8192);
> cache_clear((u_long)virt_to_phys(np), 8192);
> kernel_set_cachemode((np), 8192, IOMAP_NOCACHE_SER);
how about a bzero(np,sizeof(*np));
or, before the kernel_set_cachemode call, place
cache_push() and cache_clear() commands?
good luck!
Alan
|
|
From: Rene B. <re...@we...> - 2001-09-27 19:56:39
|
Alan Buxey wrote:
>> np = __get_free_pages(GFP_ATOMIC, 1);
> why the 1 instead of 0 ?
To get 2 pages. Sure, this is not really necessary, but the allocated
size should not be the cause of this cache problem.
>> cache_push((u_long)virt_to_phys(np), 8192);
>> cache_clear((u_long)virt_to_phys(np), 8192);
>> kernel_set_cachemode((np), 8192, IOMAP_NOCACHE_SER);
> how about a bzero(np,sizeof(*np));
> or, before the kernel_set_cachemode call, place
> cache_push() and cache_clear() commands?
I have already tried this, no success. I have also played with some
other cache invalidation code. None of that helps. I have also
modified kernel_set_cachemode() (i.e. manipulating TLBs and
cache flushing), no success.
By the way, I think that kernel_set_cachemode() is not working right,
because the while() loop that sets the cache mode for a page is only
executed once, but I have allocated 2 pages. Maybe some other drivers have
trouble with that.
> good luck!
Thanks!
Ciao, Renè
|
|
From: Alan B. <al...@ms...> - 2001-09-27 18:29:23
|
hi,
> >> kernel_set_cachemode((np), 8192, IOMAP_NOCACHE_SER);
> By the way I think that kernel_set_cachemode() does not working right,
> because the while() loop that set the cache modes for a page is only
> called once, but I have allocated 2 pages. Maybe some other drivers have
> trouble with that.
is 8192 just one page, or would you expect it to be 2?
how about using a sizeof((*np)) instead of the 8192?
alan
|
|
From: Rene B. <re...@we...> - 2001-09-28 17:55:16
|
Alan Buxey wrote:
> is 8192 just one page, or would you expect it to be 2?
8192 is 2 pages (page size = 4096); __get_free_pages() gives you 2^order
pages: 1 page for 0, 2 pages for 1, 4 pages for 2 and so on...
> how about using a sizeof((*np)) instead of the 8192?
That doesn't work properly with kernel_set_cachemode().
Yesterday I posted that there was a problem in kernel_set_cachemode(). That's
not true. I had made some changes in kernel_set_cachemode(), so that
the loop counter wasn't calculated correctly anymore. My fault! I
hope that no one has wasted time looking into what's up in
kernel_set_cachemode()...
Nevertheless, kernel_set_cachemode() does not set the cache mode for the
last page if size is not a multiple of PAGE_SIZE (as will typically be the
case with sizeof()). Some small changes should
fix this. Maybe it's not necessary, but I think the code would be more
reliable. Maybe I'll fix it later.
Ciao, Renè
|
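The order argument of __get_free_pages() is a power-of-two exponent: order n yields 2^n contiguous pages. A small user-space sketch of that arithmetic, assuming the 4096-byte page size mentioned in the thread; PAGE_SIZE_BYTES and the helper names are illustrative and not kernel symbols:

```c
#include <assert.h>

/* Assumed page size, taken from the thread (4096 bytes). */
#define PAGE_SIZE_BYTES 4096UL

/* Number of contiguous pages for a given allocation order: 2^order. */
static unsigned long pages_for_order(unsigned int order)
{
    /* The shift is only defined for orders below the word width. */
    assert(order < 8 * sizeof(unsigned long));
    return 1UL << order;
}

/* Size in bytes of an allocation of the given order. */
static unsigned long bytes_for_order(unsigned int order)
{
    return pages_for_order(order) * PAGE_SIZE_BYTES;
}
```

With these helpers, order 1 corresponds to the 8192 bytes used throughout the thread, and order 2 would be 16384 bytes.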
|
From: Ken T. <ke...@we...> - 2001-09-28 20:09:55
|
On Fri, 28 Sep 2001, Rene Brothuhn wrote:
> Nevertheless, kernel_set_cachemode() does not set the cachemode for the
> last page, if the upper boundary of size does not end up at a multiply
> of PAGE_SIZE (as it will be with sizeof()). Some little changes should
> fix this. Maybe its not necessary, but I think the code were more
> reliable. Maybe I fix it later.
Hello,
I don't understand what you mean by "last page" or "upper boundary of size".
Do you mean if I do:
page = __get_free_pages(GFP_ATOMIC,1);
which allocates two pages, the second of which does not have its cache mode
set correctly ?
Thinking about yet another play with the A4091...
Ken.
|
|
From: Rene B. <re...@we...> - 2001-09-29 16:57:15
|
Ken Tyler wrote:
> Hello,
> Don't understand what you mean by "last page" or "upper boundary of size".
> Do you mean if I :
> page = __get_free_pages(GFP_ATOMIC,1);
> which allocates two pages, the second of which does not have cache mode
> set correctly ?
Hi,
OK, I'll describe it. If you use __get_free_pages(GFP_ATOMIC, 1) you get
2 pages and you will probably pass 8192 as the size argument of
kernel_set_cachemode(). That's OK.
But if you pass sizeof(struct x) as the size argument and sizeof(struct
x) isn't a multiple of PAGE_SIZE, then the last page is not remapped by
kernel_set_cachemode(), because the loop in kernel_set_cachemode() is
initialized with the value size = size / PAGE_SIZE.
If size = 10000, you have to remap 3 pages, but 10000 / 4096 =
2 (u_long) and the loop only remaps 2 pages.
I hope it's now clear what I mean.
> Thinking about yet another play with the A4091...
It's the weekend, have fun...
Ciao, Renè
|
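The truncation described above can be reproduced in a few lines of user-space C. This is only a sketch of the arithmetic, not the actual kernel_set_cachemode() code; PAGE_SIZE_BYTES and the two helper names are assumptions for illustration. Rounding up before dividing is one common way to include a partially used last page:

```c
#include <assert.h>

/* Assumed page size, taken from the thread (4096 bytes). */
#define PAGE_SIZE_BYTES 4096UL

/* Truncating division, as the loop count is described in the post. */
static unsigned long pages_truncated(unsigned long size)
{
    return size / PAGE_SIZE_BYTES;
}

/* Round up so that a partially used last page is counted as well. */
static unsigned long pages_rounded_up(unsigned long size)
{
    /* Guard against overflow of the addition below. */
    assert(size <= (unsigned long)-1 - (PAGE_SIZE_BYTES - 1));
    return (size + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
}
```

For size = 10000 the first helper gives 2 and the second gives 3; for a size that is already a multiple of the page size, such as 8192, both give 2.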
|
From: Ken T. <ke...@we...> - 2001-09-30 10:00:00
|
On Sat, 29 Sep 2001, Rene Brothuhn wrote:
> OK, I`l describe it. If you use __get_free_pages(GFP_ATOMIC, 1) you get
> 2 pages and you will probably apply 8192 to the size argument of
> kernel_set_cachemode(). Thats OK.
> But if you apply sizeof(struct x) to the size argument and sizeof(struct
> x) isn`t a multiply of PAGE_SIZE then the last page is not remapped by
> kernel_set_cachemode(). Because the loop in kernel_set_cachemode() is
> initialized at value: size = size / PAGE_SIZE.
> If size = 10000, so you have to remap 3 pages, but 10000 / 4096 =
> 2(u_long) and the loop only remaps 2 pages.
>
> I hope now its clear what I mean.
Yes, clear as mud ;)
> > Thinking about yet another play with the A4091...
Well, I won't bother then as nothing's affected.
Thanks,
Ken.
|
|
From: Ken T. <ke...@we...> - 2001-09-27 12:09:00
|
On Wed, 26 Sep 2001, Rene Brothuhn wrote:
Hello,
What kernel version are you running ?
Something odd in 2.4.8 caused the 53c7xx.c A4091 driver to hang my system
at boot time but only with a modular kernel, even though 53c7xx.c is not a
module. Monolithic kernels were OK.
And it was hanging somewhere around :
memset((void *)instance->hostdata[0], 0, 8192);
cache_push(virt_to_phys((void *)(instance->hostdata[0])), 8192);
cache_clear(virt_to_phys((void *)(instance->hostdata[0])), 8192);
kernel_set_cachemode(instance->hostdata[0], 8192, IOMAP_NOCACHE_SER);
similar to what you have.
The current 2.4.9 CVS is OK.
(There are problems with the A4091 driver but that's a ZORRO bus glitch)
> Then the driver starts a test where the 53c770 is exchanging some 53c770
> register values with some values in np.
> The test fails, because np is cached!
> If I explicitly call: flush_dcache_range(np, np + sizeof(np)); before
> and after this test, the test doesn`t fail.
Have a look at the tests in 53c7xx.c, they work. I noticed the use of
barrier() in one test.
Ken.
|
|
From: Rene B. <re...@we...> - 2001-09-27 19:03:52
|
Ken Tyler wrote:
> What kernel version are you running ?
2.4.9; checkout date: 12/September
> Something odd in 2.4.8 caused the 53c7xx.c A4091 driver to hang my system
> at boot time but only with a modular kernel, even though 53c7xx.c is not a
> module. Monolithic kernels were OK.
Hmm, I'm using a modular kernel, but the 53c770 driver is not compiled as a
module. I will play with that a little.
> And it was hanging somewhere around :
> memset((void *)instance->hostdata[0], 0, 8192);
> cache_push(virt_to_phys((void *)(instance->hostdata[0])), 8192);
> cache_clear(virt_to_phys((void *)(instance->hostdata[0])), 8192);
> kernel_set_cachemode(instance->hostdata[0], 8192, IOMAP_NOCACHE_SER);
> similar to what you have.
> The current 2.4.9 CVS is OK.
> (There are problems with the A4091 driver but that's a ZORRO bus glitch)
>> Then the driver starts a test where the 53c770 is exchanging some 53c770
>> register values with some values in np.
>> The test fails, because np is cached!
>> If I explicitly call: flush_dcache_range(np, np + sizeof(np)); before
>> and after this test, the test doesn`t fail.
> Have a look at the tests in 53c7xx.c, they work. I noticed the use of
> barrier() in one test.
I have slightly modified 53c7xx.c and 53c7xx.h to get them to run with
the 53c770 chip, because the differences between these 2 chips are not
very large (only some register changes). But I notice that the driver
test hangs at barrier()...
Is it right that you use APUS with the 53c7xx.c driver on a CSPPC?
That means that the mechanism I use to get some uncached memory is working
on an APUS-604e machine. So what's wrong here?
I don't understand this :-(
Ciao, Renè
|
|
From: Ken T. <ke...@we...> - 2001-09-28 10:58:50
|
On Thu, 27 Sep 2001, Rene Brothuhn wrote:
> 2.4.9; checkout date: 12/September
My last update was on the 17th, may be worth doing an update.
> Hmm, I`m using a modular kernel, but the 53c770 are not compiled as a
> module, will playing a little with that.
See if it's the same with a one-piece kernel.
> Is it right that you use APUS with the 53c7xx.c driver on a CSPPC?
> That means that the machnism I use to get some uncached mem is working
> on an APUS-604e machine. So whats wrong here?
I do. It's OK if I don't work it too hard: reading CDROMs and
reading/writing ZIPs it will run all day, but if I work it harder on
faster drives it eventually hangs anytime between 1 second and 1/2 an hour
of use. I spent ages on it and came to the conclusion it is a (known)
zorro bus arbitration problem or possibly a flaw in the CSPPC arbitration.
> I don`t understand this :-(
Don't know much about scripts but are the 770 scripts the same as the 710
scripts ?
Ken.
|
|
From: schneider <sch...@te...> - 2001-09-28 14:26:27
|
Ken Tyler wrote:
> On Thu, 27 Sep 2001, Rene Brothuhn wrote:
>> 2.4.9; checkout date: 12/September
> My last update was on the 17th, may be worth doing an update.
>> Hmm, I`m using a modular kernel, but the 53c770 are not compiled as a
>> module, will playing a little with that.
> See if it's the same with a one-piece kernel.
>> Is it right that you use APUS with the 53c7xx.c driver on a CSPPC?
>> That means that the machnism I use to get some uncached mem is working
>> on an APUS-604e machine. So whats wrong here?
> I do. It's OK if I don't work it too hard, reading CDROMS and
> reading/writing ZIPs it will run all day but if I work it harder it on
> faster drives it eventually hangs anytime between 1 second and 1/2 an hour
> of use. I spent ages on it and came to the conclusion it is a (known)
> zorro bus arbitration problem or possibly a flaw in the CSPPC arbitration.
>> I don`t understand this :-(
> Don't know much about scripts but are the 770 scripts the same as the 710
> scripts ?
> Ken.
Hi,
I got the cache test running; in my opinion it looks like an alignment
problem or a hardware bug. I also changed script_asm.pl to 770,
because some registers are different on the 710 and 8x0. Did you look at
the BSD driver?
Axel
|
|
From: Rene B. <re...@we...> - 2001-09-28 17:56:10
|
schneider wrote:
> Hi,
> i got the cache test running, in my opinien it seems like an
> alignement problem or hardware bug. I also changed the script_asm.pl
> to 770, because some registers are different on 710 and 8x0. Did you
> looked at the BSD driver?
Are you playing with 53c770.c or with 53c7xx.c? As far as I know, 53c770.c has
all needed scripts embedded in the source code; no .scr files are used.
Maybe I'm wrong.
Ciao, Renè
|
|
From: Ken T. <ke...@we...> - 2001-09-28 19:54:17
|
On Fri, 28 Sep 2001, schneider wrote:
> Hi,
> i got the cache test running, in my opinien it seems like an alignement
> problem or hardware bug. I also changed the script_asm.pl to 770,
> because some registers are different on 710 and 8x0. Did you looked at
> the BSD driver?
No, but I crudely hacked the sim710.c driver to work a while ago and it
had the exact same problem.
Ken
|
|
From: Rene B. <re...@we...> - 2001-09-28 17:55:55
|
Ken Tyler wrote:
>> Hmm, I`m using a modular kernel, but the 53c770 are not compiled as a
>> module, will playing a little with that.
> See if it's the same with a one-piece kernel.
I have now compiled the kernel monolithic, without module support, and the
cache problem is gone. Weird stuff. I'm not sure whether this is related to
module support or not, because I have also "cleaned" the kernel config of
everything that I don't need. Maybe I will figure it out in the future. For now I
will continue working on the 53c770 driver.
I have some questions about the 53c770 driver. Who contributed it to the
APUS kernel? Is it APUS only, so that I can throw away everything that's
not needed and make changes without #ifdefs?
> Don't know much about scripts but are the 770 scripts the same as the 710
> scripts ?
Scripts are upward compatible. The 770 has a few more Scripts commands,
that's all.
Ciao, Renè
|
|
From: Ken T. <ke...@we...> - 2001-09-28 20:01:52
|
On Fri, 28 Sep 2001, Rene Brothuhn wrote:
> I now have compiled the kernel monolithic without module support and the
> cache-problem is gone, wierd stuff. I`m not sure this is related to
> module support or not, because I have "cleaned" the kernel-config from
> all stuff that I don`t need. Maybe I figure it out in future. For now I
> will further working on the 53c770 driver.
I always 'make mrproper' when changing kernels from modular to monolithic
and vice versa, and then build from scratch. I did this a couple of times
with 2.4.8; the module/scsi problem remained.
> I have some questions to the 53c770 driver. Who has released it to the
> APUS kernel? Is it APUS only, so that I can throw away all stuff thats
> not needed and can make changes without #ifdef`s?
That was fh...@at... who was working on getting it running. Don't know
what platforms use 770.
Ken.
|
|
From: Roman Z. <zi...@li...> - 2001-09-28 11:41:09
|
Rene Brothuhn wrote:
> np = __get_free_pages(GFP_ATOMIC, 1);
> cache_push((u_long)virt_to_phys(np), 8192);
> cache_clear((u_long)virt_to_phys(np), 8192);
> kernel_set_cachemode((np), 8192, IOMAP_NOCACHE_SER);
You need to boot with nobats, otherwise kernel_set_cachemode doesn't work.
Anyway, in the long run kernel_set_cachemode shouldn't be used, it
simply isn't portable. The only problem is there isn't any standard
interface yet to do this properly.
Something like this is needed:
vaddr = __vmalloc(size, GFP_KERNEL, _PAGE_NO_CACHE | _PAGE_GUARDED);
for every page
    paddr = va_to_phys(vaddr);
va_to_phys currently exists only for ppc, but it's quite generic code.
For testing you can try it with nobats and we can change it later, but
you should already take care to only work with single pages, that makes
the conversion later easier.
bye, Roman
|
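The "for every page" step above can be illustrated in user space. The sketch below is not kernel code: fake_va_to_phys() is a hypothetical stand-in for va_to_phys() that maps each virtual page to an arbitrary, non-contiguous fake physical page, and PAGE_SIZE_BYTES and collect_phys_pages() are likewise made up. It only shows why a virtually contiguous region has to be translated one page at a time:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size, taken from the thread (4096 bytes). */
#define PAGE_SIZE_BYTES 4096UL

/* Hypothetical stand-in for va_to_phys(): a real implementation would
 * walk the page tables. Here every virtual page is mapped to an
 * arbitrary fake physical page so that neighbours are not contiguous. */
static unsigned long fake_va_to_phys(unsigned long vaddr)
{
    unsigned long vpage = vaddr / PAGE_SIZE_BYTES;
    unsigned long offset = vaddr % PAGE_SIZE_BYTES;
    unsigned long ppage = (vpage * 7UL + 3UL) & 0xFFFFUL;

    return ppage * PAGE_SIZE_BYTES + offset;
}

/* Translate a page-aligned, virtually contiguous region one page at a
 * time, storing one physical address per page.
 * Returns the number of entries written to phys[]. */
static size_t collect_phys_pages(unsigned long vaddr, unsigned long size,
                                 unsigned long *phys, size_t max_entries)
{
    size_t n = 0;
    unsigned long off;

    assert(vaddr % PAGE_SIZE_BYTES == 0);
    for (off = 0; off < size && n < max_entries; off += PAGE_SIZE_BYTES)
        phys[n++] = fake_va_to_phys(vaddr + off);
    return n;
}
```

Because the second page of the region does not follow the first in (fake) physical memory, code that hands addresses to a device has to treat each page separately, which matches the advice to work with single pages.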
|
From: Rene B. <re...@we...> - 2001-09-28 17:55:56
|
Roman Zippel wrote:
> You need to boot with nobats, otherwise kernel_set_cachemode doesn't
> work.
Will try it.
> Anyway, in the long run kernel_set_cachemode shouldn't be used, it's
> simply isn't portable. The only problem is there isn't any standard
> interface yet to do this properly.
> Something like this is needed:
> vaddr = __vmalloc(size, GFP_KERNEL, _PAGE_NO_CACHE | _PAGE_GUARDED);
> for every page
> paddr = va_to_phys(vaddr);
>
> va_to_phys exists currently only for ppc, but it's quite generic code.
> For testing you can try it with nobats and we can change it later, but
> you should already take care to only work with single pages, that makes
> the conversion later easier.
As far as I know, the standard interface to get some uncached memory is
pci_alloc_consistent(), as described in Documentation/DMA-mapping.txt.
This seems to work on APUS, but I'm not sure whether the TLBs are marked
as cache-inhibited.
Ciao, Renè
|
|
From: Roman Z. <zi...@li...> - 2001-09-28 21:53:42
|
Hi,
Rene Brothuhn wrote:
> As far as I know the standard interface to get some uncached mem is
> pci_alloc_consistent() as described in Documentation/DMA-mapping.txt.
> This seems to working on APUS, but I`m not sure if the TLB`s are marked
> as cache-inhibit.
No, they aren't. First, it's PCI specific and it doesn't mention the
cache at all. If you look at arch/ppc/kernel/pci-dma.c, you see it's
just a __get_free_pages(), but you need a new mapping to get the new
cache mode.
bye, Roman
|
|
From: Rene B. <re...@we...> - 2001-09-29 16:57:11
|
Roman Zippel wrote:
> Hi,
> Rene Brothuhn wrote:
>> As far as I know the standard interface to get some uncached mem is
>> pci_alloc_consistent() as described in Documentation/DMA-mapping.txt.
>> This seems to working on APUS, but I`m not sure if the TLB`s are marked
>> as cache-inhibit.
> No there aren't. First it's pci specific and it doesn't mention the
> cache at all. If you look at arch/ppc/kernel/pci-dma.c, you see it's
> just a __get_free_pages(), but you need a new mapping to get the new
> cache mode.
Hello,
Documentation/DMA-mapping.txt says that pci_alloc_consistent() can also
be used on machines that don't have a PCI bus.
Ciao, Renè
|
|
From: Roman Z. <zi...@li...> - 2001-09-29 17:41:32
|
Hi,
Rene Brothuhn wrote:
> Documentation/DMA-mapping.txt says that can pci_alloc_consistent()
> also be used on machines that don`t have PCI-bus.
It says the API can also be used for other machines; I'm not that sure
about the implementation (although most of it should be generic enough).
For pci_alloc_consistent() it says to pass NULL for non-PCI devices and
mentions only (E)ISA, which sounds like it's just meant for the normal PC
architecture. Anyway, if the bus doesn't need any special
consideration, it should work as well, but then I'd prefer a better naming.
bye, Roman
|