|
From: NIIBE Y. <gn...@m1...> - 2001-08-05 08:26:18
|
Last year, we implemented FPU use in the kernel. That is, instead of assuming that the kernel doesn't use the FPU, we let drivers use it. This was done because GCC at that time used the FPU for division.
In the code, we use init_task as a special task: when the kernel uses the FPU, we pretend that the "user process" init_task is using it. It's a kind of hack.
The kernel has changed since then, and this no longer works because of special kernel threads. That was fine so far, as kernel threads did not do general things. However, now we have the ksoftirqd kernel threads, which handle software interrupts (or bottom halves). This means ksoftirqd calls driver routines, and if we need to use the FPU in the kernel, we need another approach.
I think that we should remove FPU support from the kernel to clean things up. I think we can assume that GCC provides some way of not using the FPU for division.
Honestly speaking, I couldn't find a good way to extend the current implementation to general kernel FPU support (more than one kernel task using the FPU). If we really need this, I think it is better to design a general one than to extend the current one.
Comments? Opinions? Does anyone have a driver which uses the FPU?
-- |
|
From: Masahiro A. <m-...@aa...> - 2001-08-07 14:20:18
|
I may not be understanding your point clearly, but I would like to note...
On Sun, 5 Aug 2001 17:26:12 +0900 (JST) NIIBE Yutaka <gn...@m1...> wrote:
> Honestly speaking, I couldn't find a good way to extend current
> implementation to general kernel FPU support (more than one kernel
> task uses FPU). If we really need this, I think that it is good
> design general one rather than extending current one.
>
> Comments? Opinions? Does anyone have driver which uses FPU?
RTLinux for SH4 uses the FPU. Or more precisely, a real-time task running under RTLinux can use the FPU. There can be more than one task running simultaneously.
I'm taking a summer break now and my brain has stopped working ^^;), but I would like to say that:
1) it would be good if the linux-sh kernel could handle FPU issues correctly, even if it is compiled with -m4 (without the -m4-nofpu option).
2) if that is not possible, the RTLinux patch should take care of it.
3) or, I (or somebody) should contribute to what Niibe-san is concerned about now.
That is my opinion. I'll revisit this issue when I get back to work.
--
Masahiro ABE, A&D Co., Ltd. |
|
From: NIIBE Y. <gn...@m1...> - 2001-08-08 00:37:44
|
Masahiro Abe wrote:
> RTLinux for SH4 uses FPU. Or more precisely, real-time task running
> under rtlinux can use fpu. There can be more than one task running
> simultaneously.
How do you use it? Save all FPU registers at context switch or interrupts?
Do we need FPU in the interrupt handlers?
If we can assume there's no use of FPU in interrupt handlers, any task (including kernel threads) works well with lazy save/restore.
-- |
|
From: Masahiro A. <m-...@aa...> - 2001-08-09 14:44:08
|
On Wed, 8 Aug 2001 09:37:32 +0900 (JST) NIIBE Yutaka <gn...@m1...> wrote:
> Masahiro Abe wrote:
> > RTLinux for SH4 uses FPU. Or more precisely, real-time task running
> > under rtlinux can use fpu. There can be more than one task running
> > simultaneously.
>
> How do you use it? Save all FPU registers at context switch or
> interrupts?
Well, I had the wrong thing in mind, and I didn't state the situation clearly. Sorry.
The RTLinux-SH4 scheduler saves/restores the FPU registers at its own context switch time. It has its own mechanism for switching tasks. (It's not lazy FPU save/restore, because I'm lazy.)
So now I think the point I made above is worthless. From the RTLinux standpoint, I think Niibe-san's proposal is OK, with one note below:
Please don't force the use of the -m4-nofpu option for compiling the kernel. We've had a discussion on the Japanese ML that "-m4-nofpu" and "-m4" are not compatible in terms of passing arguments to functions. "-m4-nofpu" does that the SH3 way, while "-m4" does that the SH4 way, which is different. (I can't show you details now, but Kojima-san may remember this discussion.) The RTLinux core module and other real-time task modules must use "-m4" to be able to use the FPU. A "-m4-nofpu" kernel and a "-m4" module can't work properly together.
So, please consider introducing "-fno-fpudiv" or "-fidiv" for "-m4", as Niibe-san proposed, and make a "-m4 -fno-fpudiv" kernel and a "-m4" module work properly together.
Browsing the other branch of this thread, my concern may already be addressed. I just wanted to add one more piece of information.
--
Masahiro ABE, A&D Co., Ltd. |
|
From: NIIBE Y. <gn...@m1...> - 2001-08-10 00:22:47
|
Masahiro Abe wrote:
> Please don't force to use -m4-nofpu option for compiling kernel. We've
> had a discussion in Japanese ML that "-m4-nofpu" and "-m4" are not
> compatible in terms of passing arguments to functions. "-m4-nofpu" does
> that in SH3 way, while "-m4" does that in SH4 way, which is different.
> (I can't show you details now, but Kojima-san may remember this
> discussion.) RTLinux core module and other Realtime task module must use
> "-m4" to be able to use FPU. "-m4-nofpu" kernel and "-m4" module can't
> work properly together.
>
> So, please consider introducing "-fno-fpudiv" or "-fidiv" for "-m4", as
> Niibe-san proposed, and make "-m4 -fno-fpudiv" kernel and "-m4" module
> work properly.
Exactly. That's the reason to switch to using "-m4" together with "-fno-fpudiv".
In short, the flag "-m4-nofpu" is not for SH-4; it's an SH-3 variant (though I think the name is a misnomer).
-- |
|
From: Greg B. <gb...@po...> - 2001-08-10 00:32:11
|
NIIBE Yutaka wrote:
>
> Masahiro Abe wrote:
> > [...] We've
> > had a discussion in Japanese ML that "-m4-nofpu" and "-m4" are not
> > compatible in terms of passing arguments to functions. "-m4-nofpu" does
> > that in SH3 way,
Aaaaargh!
> In short, The flag "-m4-nofpu" is not for SH-4, it's SH-3 variant
> (Though I think that the name is misnomer).
Certainly sounds like it. So when do we get a -fnofpu option?
Greg.
--
If it's a choice between being a paranoid, hyper-suspicious global
village idiot, or a gullible, mega-trusting sheep, I don't look
good in mint sauce. - jd, slashdot, 11Feb2000. |
|
From: Greg B. <gb...@po...> - 2001-08-07 23:27:41
|
NIIBE Yutaka wrote:
>
> The kernel has been changed since, this doesn't work any more, because
> of special kernel thread. So far so good, as kernel threads do not do
> general things. However, now we have ksoftirqd kernel threads, and it
> handles software interrupts (or bottom halves). This means, ksoftirqd
> calls routines of drivers, and if we need using FPU in kernel, we need
> another approach.
So, if we needed the FPU in the kernel, we'd have to switch FPU state on context switch for kernel threads as well as user threads?
> I think that we should remove the support of FPU in kernel to clean up
> things.
Yes. There's no good reason to have FPU support for the kernel. I seem to remember there was some issue with the Dreamcast framebuffer?
> I think that we can assume GCC provide some way of not using
> FPU for division.
That's what -m4-nofpu is for, right?
> Honestly speaking, I couldn't find a good way to extend current
> implementation to general kernel FPU support (more than one kernel
> task uses FPU). If we really need this, I think that it is good
> design general one rather than extending current one.
Yes, but I don't think we need it.
Greg.
--
If it's a choice between being a paranoid, hyper-suspicious global
village idiot, or a gullible, mega-trusting sheep, I don't look
good in mint sauce. - jd, slashdot, 11Feb2000. |
|
From: NIIBE Y. <gn...@m1...> - 2001-08-07 23:56:05
|
Let me explain more.
The easiest way to support the FPU in the kernel is to (re)define the context, and save/restore all FPU registers at every context switch and exception. But that's a waste of time, you know.
We've implemented lazy FPU save/restore. As many applications don't use the FPU, we save time by not saving/restoring FPU registers all the time. We don't save FPU registers for exception handling, interrupts or system calls. We only save FPU registers at context switch if the FPU was really used. And we delay the restore of FPU registers until the FPU is actually used (detected by an exception). The FPU registers are saved at the context switch, and will be restored on demand.
I think that this feature makes sense. I don't think we should remove this.
The questionable feature is the use of init_task for "kernel task" FPU usage. When I implemented it, I thought it was a good idea...
When the kernel uses the FPU, we use init_task to save/restore FPU registers, as if it were a user process. We also use the stack to save/restore FPU registers when the kernel goes to handle an exception.
Greg Banks wrote:
> So, if we needed the FPU in the kernel, we'd have to switch FPU state
> on context switch for kernel threads as well as user threads?
Not exactly. Instead, we should avoid doing that for kernel threads. Currently, we have special code at context switch: if it's not init_task, we save the FPU registers into the task structure. I don't know how we can extend this... It could be implemented as "it's not a kernel thread."
The kernel FPU save/restore is done in the code of fpu.c, when the kernel gets an exception or goes back to user mode.
Perhaps we need to implement something to share FPU registers among kernel threads.
> > I think that we can assume GCC provide some way of not using
> > FPU for division.
>
> That's what -m4-nofpu is for, right?
Yes, that was my intention. However, I'm sorry, it seems that was my misunderstanding.
I'm now re-examining our GCC patch, and it seems to me that -m4-nofpu is a special kind of SH-3, not SH-4 (to me, it's amazing :-(). I think that we should provide our own switch to disable FPU division. Kaz once implemented such a patch; perhaps I'll take it.
-- |
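For readers following the lazy save/restore description in the message above, here is a minimal user-space sketch of the idea. It is not the linux-sh code: the register count, the structure layout and every name in it (`switch_to`, `fpu_first_use_trap`, `fpu_write`, `fpu_read`, the `saves`/`restores` counters) are made up for illustration, and the "FPU disabled" exception is simulated with a flag.

```c
#include <assert.h>
#include <stdbool.h>

#define NFPREGS 16                  /* illustrative count, not the SH-4 register layout */

struct fpu_state { float regs[NFPREGS]; };

struct task {
    struct fpu_state saved;         /* per-task save area */
    bool used_fpu;                  /* task touched the FPU since it was switched in */
};

/* Simulated hardware: the live register file and an "FPU disabled" flag. */
struct fpu_state hw_fpu;
bool fpu_disabled = true;
struct task *current_task;
int saves, restores;                /* counters that show how much work was skipped */

/* Context switch: save only if the outgoing task really used the FPU,
 * then disable the FPU so that the next use traps. */
void switch_to(struct task *next)
{
    if (current_task && current_task->used_fpu) {
        current_task->saved = hw_fpu;
        current_task->used_fpu = false;
        saves++;
    }
    fpu_disabled = true;
    current_task = next;
}

/* Stand-in for the "FPU disabled" exception handler: restore on demand. */
void fpu_first_use_trap(void)
{
    hw_fpu = current_task->saved;
    current_task->used_fpu = true;
    fpu_disabled = false;
    restores++;
}

/* Stand-ins for floating-point instructions executed by the current task. */
void fpu_write(int reg, float v)
{
    assert(reg >= 0 && reg < NFPREGS);
    if (fpu_disabled)
        fpu_first_use_trap();
    hw_fpu.regs[reg] = v;
}

float fpu_read(int reg)
{
    assert(reg >= 0 && reg < NFPREGS);
    if (fpu_disabled)
        fpu_first_use_trap();
    return hw_fpu.regs[reg];
}
```

A task that never touches the FPU costs neither a save nor a restore in this model, which is the saving described above. The sketch deliberately has no notion of kernel threads or interrupt handlers using the FPU, which is exactly the case the thread is trying to decide about.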
|
From: Greg B. <gb...@po...> - 2001-08-08 00:33:59
|
NIIBE Yutaka wrote:
>
> We've implemented lazy FPU save/restore. [...]
>
> I think that this feature makes sense. I don't think we should remove
> this.
I agree.
> Perhaps, we need to implement something to share FPU registers among
> kernel threads.
I've got another idea. Let's change the FPU first-use trap handling code to panic if the FPU is used from kernel space.
> > > I think that we can assume GCC provide some way of not using
> > > FPU for division.
> >
> > That's what -m4-nofpu is for, right?
>
> Yes, that is my intention. However, I'm sorry that it seems it's my
> misunderstanding. I'm now re-examining our patch of GCC, and it seems
> for me that -m4-nofpu is special kind of SH-3, not SH-4 (for me it's
> amazing :-().
Huh? I seem to remember we had a problem where the gcc maintainers were making this (incorrect) assumption. So, using -m4-nofpu resulted in __sh3__ being defined instead of __sh4__. But I thought that had been cleared up by now.
> I think that we should provide our own switch to
> disable FPU division. Kaz once has such a patch implemented, perhaps
> I'll take it.
Fair enough.
Greg.
--
If it's a choice between being a paranoid, hyper-suspicious global
village idiot, or a gullible, mega-trusting sheep, I don't look
good in mint sauce. - jd, slashdot, 11Feb2000. |
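The "panic if the FPU is used from kernel space" idea in the message above could look roughly like the following sketch. It only models the decision: the bit position used for `SR_MD` is an assumption about the saved status register, and `fpu_first_use_action` and the `trap_action` values are hypothetical names, not existing kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for this sketch: the privileged-mode bit of the saved status register. */
#define SR_MD (1u << 30)

enum trap_action {
    TRAP_RESTORE_FPU,   /* user task: do the lazy restore and continue */
    TRAP_PANIC          /* kernel code touched the FPU: treat it as a bug */
};

/* Decide what the FPU first-use trap should do, given the status
 * register that was saved when the exception was taken. */
enum trap_action fpu_first_use_action(uint32_t saved_sr)
{
    bool from_kernel = (saved_sr & SR_MD) != 0;
    return from_kernel ? TRAP_PANIC : TRAP_RESTORE_FPU;
}
```

The appeal of this approach is that an accidental FPU instruction in a driver or in libgcc shows up immediately, instead of silently corrupting some user task's floating-point state.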
|
From: NIIBE Y. <gn...@m1...> - 2001-08-08 00:47:52
|
Greg Banks wrote:
> Huh? I seem to remember we had a problem where the gcc maintainers
> were making this (incorrect) assumption. So, using -m4-nofpu resulted
> in __sh3__ being defined instead of __sh4__. But I thought that had
> been cleared up by now.
Yes, I had thought that, and have been using -m4-nofpu with su. But looking at the code closely this time, it's actually SH-3, and the __sh3__ definition seems correct. It's quite counter-intuitive and confusing, though. Well, I've objdump-ed the .o produced by -m4-nofpu, and found that the architecture is set to "SH3", not "SH4". Ummm...
I don't know what the actual definition of -m4-nofpu is. I believe that it was Hitachi's request to RedHat/Cygnus, possibly for unreleased chips (or custom chips). I'd like to confirm with the Hitachi people. What is it? If there's information you could provide us, please let me know. Nozawa-san?
I think it's not safe to rely on our assumption about the definition of -m4-nofpu, as the current implementation is not consistent. It would be good to have a concrete option to specify "no FPU please" for -m4.
-- |
|
From: M. R. B. <mr...@0x...> - 2001-08-08 02:12:52
|
* NIIBE Yutaka <gn...@m1...> on Wed, Aug 08, 2001:
>
> Yes, I had thought that, and have been using -m4-nofpu with su. But
> looking the code deeply this time, it's SH-3 actually, and __sh3__
> definition seems correct. It's quite counter-intuitive, and confusing
> though. Well, I've objdump-ed the .o produced by -m4-nofpu, and found
> that the architecture is set to "SH3" not "SH4". Ummm...
>
This is correct.
> I don't know what is the actual definition of -m4-nofpu. I believe
> that it's Hitachi's request to RedHat/Cygnus, possibly for unreleased
> chips (or custom chips). I'd like to confirm Hitachi people. What
> is that? If there's information you could provide us, please let me
> know. Nozawa-san?
>
See below.
> I think, it's not safe to rely on our assumption of the definition of
> -m4-nofpu, as current implementation is not consistent. It's good to
> have concrete one to specify "no FPU please" option for -m4.
Well, it's clear-cut what it does in gcc, it generates code for a sh3 and
below, but acts as if it had sh4 hardware:
{"4-nofpu", SH3_BIT|SH2_BIT|SH1_BIT|HARD_SH4_BIT, "" }
gcc won't generate code for anything higher than a sh3, and since the sh3
can't generate fpu code, this is (somewhat) of an alias for a sh4 without
fp support. So the __sh3__ predefine is correct, because if you define
__SH4__ you'll pull in fp code into libgcc, and you'll get link errors.
M. R.
|
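To make the reading of that table entry concrete, here is a small sketch of how such a flag mask can be interpreted. Only the names `SH1_BIT`, `SH2_BIT`, `SH3_BIT` and `HARD_SH4_BIT` come from the entry quoted above; the numeric bit values, `SH4_BIT`, the `M4_ILLUSTRATIVE` mask and both helper functions are assumptions made up for this example and are not taken from the gcc sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for this sketch only. */
enum {
    SH1_BIT      = 1 << 0,
    SH2_BIT      = 1 << 1,
    SH3_BIT      = 1 << 2,
    SH4_BIT      = 1 << 3,   /* hypothetical: "may emit SH-4 instructions" */
    HARD_SH4_BIT = 1 << 4    /* "the hardware is an SH-4" */
};

/* The mask from the quoted table entry for "4-nofpu". */
#define M4_NOFPU (SH3_BIT | SH2_BIT | SH1_BIT | HARD_SH4_BIT)

/* A hypothetical full SH-4 mask, for comparison only. */
#define M4_ILLUSTRATIVE (SH4_BIT | SH3_BIT | SH2_BIT | SH1_BIT | HARD_SH4_BIT)

/* Code generation is limited by the highest instruction-set bit present. */
bool generates_sh4_code(int flags)   { return (flags & SH4_BIT) != 0; }

/* The hardware assumption is tracked separately from code generation. */
bool assumes_sh4_hardware(int flags) { return (flags & HARD_SH4_BIT) != 0; }
```

Under these assumptions, `M4_NOFPU` answers "yes" to `assumes_sh4_hardware` but "no" to `generates_sh4_code`, which is the split described in the message above: SH-3 level code generation on what is assumed to be SH-4 hardware.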
|
From: M. R. B. <mr...@0x...> - 2001-08-08 02:07:48
|
* Greg Banks <gb...@po...> on Wed, Aug 08, 2001:
>
> Huh? I seem to remember we had a problem where the gcc maintainers
> were making this (incorrect) assumption. So, using -m4-nofpu resulted
> in __sh3__ being defined instead of __sh4__. But I thought that had
> been cleared up by now.
>
There was a thread on this a while back. m4-nofpu is an alias for the sh3, as the sh3 code in gcc can't generate fpu code. So the name 'm4-nofpu' itself is a misnomer. As always, blame the gcc people ;-).
M. R. |
|
From: Greg B. <gb...@po...> - 2001-08-08 06:44:49
|
NIIBE Yutaka wrote:
>
> Yes, I had thought that, and have been using -m4-nofpu with su. But
> looking the code deeply this time, it's SH-3 actually, and __sh3__
> definition seems correct. It's quite counter-intuitive, and confusing
> though. Well, I've objdump-ed the .o produced by -m4-nofpu, and found
> that the architecture is set to "SH3" not "SH4". Ummm...
"M. R. Brown" wrote:
>
> Well, it's clear-cut what it does in gcc, it generates code for a sh3 and
> below, but acts as if it had sh4 hardware:
>
> {"4-nofpu", SH3_BIT|SH2_BIT|SH1_BIT|HARD_SH4_BIT, "" }
>
> gcc won't generate code for anything higher than a sh3, and since the sh3
> can't generate fpu code, this is (somewhat) of an alias for a sh4 without
> fp support. So the __sh3__ predefine is correct, because if you define
> __SH4__ you'll pull in fp code into libgcc, and you'll get link errors.
>
Ok, now I'm more confused than I was yesterday.
Firstly, the gcc I have here (rather old) doesn't define __sh3__
when you give it -m4-nofpu. This might be because we have one of
Kaz' patches. What it does define is:
sh-linux-gnu-gcc --verbose -m3 -o foo-m3.o -c foo.c
[...] -D__SH3__ -D__sh3__ -ml -m3
sh-linux-gnu-gcc --verbose -m4 -o foo-m4.o -c foo.c
[...] -D__SH4__ -ml -m4
sh-linux-gnu-gcc --verbose -m4-nofpu -o foo-m4-nofpu.o -c foo.c
[...] -D__SH4__ -D__SH4_NOFPU__ -ml -m4-nofpu
Which looks pretty sensible to me. The architecture in the ELF files
is pretty screwed up, but it's *consistently* screwed up so everything
still works. The -m4-nofpu option uses the same integer division millicode
as -m3, which is correct.
Secondly, defining __sh3__ is the *wrong* thing to do for the kernel.
One obvious problem is that with __sh3__ defined, lots of stuff in the
kernel breaks completely because the SH internal module registers have
different addresses for SH3 and SH4. For example:
#if defined(__sh3__)
#define TMU_TOCR 0xfffffe90 /* Byte access */
#define TMU_TSTR 0xfffffe92 /* Byte access */
#define TMU0_TCOR 0xfffffe94 /* Long access */
#define TMU0_TCNT 0xfffffe98 /* Long access */
#define TMU0_TCR 0xfffffe9c /* Word access */
#define FRQCR 0xffffff80
#elif defined(__SH4__)
#define TMU_TOCR 0xffd80000 /* Byte access */
#define TMU_TSTR 0xffd80004 /* Byte access */
#define TMU0_TCOR 0xffd80008 /* Long access */
#define TMU0_TCNT 0xffd8000c /* Long access */
#define TMU0_TCR 0xffd80010 /* Word access */
#define FRQCR 0xffc00000
#endif
(I'm using an old kernel too)
What the kernel needs is an option that generates SH4 code with __SH4__
defined, but no FPU code used from libgcc.a. With the gcc I'm using here,
-m4-nofpu is that option.
So I'm very confused. Niibe-san, does your gcc behave differently to mine?
Greg.
--
If it's a choice between being a paranoid, hyper-suspicious global
village idiot, or a gullible, mega-trusting sheep, I don't look
good in mint sauce. - jd, slashdot, 11Feb2000.
|
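The message above mentions the integer division millicode that a no-FPU build falls back on. As a rough illustration of what dividing without the FPU involves, here is a plain shift-and-subtract division in C. It is a textbook algorithm written for this note, not the actual libgcc routine, and `soft_udiv32` is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned 32-bit division using only shifts, compares and subtracts.
 * Returns the quotient; stores the remainder through rem if it is non-NULL. */
uint32_t soft_udiv32(uint32_t n, uint32_t d, uint32_t *rem)
{
    assert(d != 0);                      /* division by zero is the caller's bug */

    uint32_t q = 0;
    uint64_t r = 0;                      /* wide enough that the shift below cannot overflow */

    for (int i = 31; i >= 0; i--) {
        r = (r << 1) | ((n >> i) & 1u);  /* bring down the next dividend bit */
        if (r >= d) {                    /* divisor fits: subtract it and set the quotient bit */
            r -= d;
            q |= 1u << i;
        }
    }
    if (rem)
        *rem = (uint32_t)r;
    return q;
}
```

A routine of this kind uses no floating-point registers at all, so kernel code built with it has no FPU state to save or restore when it divides.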
|
From: NIIBE Y. <gn...@m1...> - 2001-08-08 06:58:35
|
Greg Banks wrote:
> Firstly, the gcc I have here (rather old) doesn't define __sh3__
> when you give it -m4-nofpu. This might be because we have one of
> Kaz' patches.
Yes, this is because of our patch; perhaps I did that. The original implementation defines __sh3__ for -m4-nofpu (not __SH4__). I thought that it was a mistake, but it seems it is intentional.
> sh-linux-gnu-gcc --verbose -m3 -o foo-m3.o -c foo.c
> [...] -D__SH3__ -D__sh3__ -ml -m3
>
> sh-linux-gnu-gcc --verbose -m4 -o foo-m4.o -c foo.c
> [...] -D__SH4__ -ml -m4
>
> sh-linux-gnu-gcc --verbose -m4-nofpu -o foo-m4-nofpu.o -c foo.c
> [...] -D__SH4__ -D__SH4_NOFPU__ -ml -m4-nofpu
>
> Which looks pretty sensible to me.
Yes, we do that. But the original definition seems not to be that. So, I'd like to ask Hitachi what the order/request (specification) to RedHat/Cygnus about -m4-nofpu was.
> The -m4-nofpu option uses the same integer division millicode as
> -m3, which is correct.
Yes, that's _our_ intention.
> Secondly, defining __sh3__ is the *wrong* thing to do for the kernel.
Yes, because we distinguish the two by it. And it's not just the kernel: it affects all programs, because we share the kernel and C library header files between SH3 and SH4.
> What the kernel needs is an option that generates SH4 code with __SH4__
> defined, but no FPU code used from libgcc.a. With the gcc I'm using here,
> -m4-nofpu is that option.
Yes.
> So I'm very confused. Niibe-san, does your gcc behave differently to mine?
Mine works exactly like yours. But the original GCC does not. And it seems that it was my misunderstanding that -m4-nofpu is the option for our needs (SH-4 with no FPU division). At the time -m4-nofpu was introduced, because we really wanted such an option, we just hurried to adopt it (assuming it *was* for us), changing the CPP macro.
Now, I think that it's not. I think what we should use is -m4 and, say, -fno-fpudiv or -fidiv. The option -m4-nofpu would be the option for some custom model, a variant of SH, perhaps.
BTW, thanks for the logos. It looks pretty good.
-- |
|
From: Greg B. <gb...@po...> - 2001-08-08 07:06:46
|
NIIBE Yutaka wrote:
>
> Greg Banks wrote:
> > So I'm very confused. Niibe-san, does your gcc behave differently to mine?
>
> Mine works exactly as yours. But the original GCC does not. And it
> seems that it is my misunderstanding that -m4-nofpu is the one for our
> demands (SH-4 with no FPU division). At that time -m4-nofpu was
> introduced, because we really wanted such a option, we're just hurry
> to adopt this option (assuming this *is* for us), changing CPP macro.
Aha, now I understand.
> Now, I think that it's not. I think what we should use is -m4 and
> say, -fno-fpudiv or -fidiv. The option -m4-nofpu would be the option
> for some custom model, variant of SH, perhaps.
Ok. Fair enough.
> BTW, thanks for the logos. It looks pretty good.
Do you want me to check in the header file?
Greg.
--
If it's a choice between being a paranoid, hyper-suspicious global
village idiot, or a gullible, mega-trusting sheep, I don't look
good in mint sauce. - jd, slashdot, 11Feb2000. |
|
From: NIIBE Y. <gn...@m1...> - 2001-08-08 07:12:46
|
Greg Banks wrote:
> Do you want me to check in the header file?
Yes, please. Thanks for your cooperation.
-- |
|
From: YAEGASHI T. <t...@ke...> - 2001-08-08 07:17:04
|
In the article <3B7...@po...>, Greg Banks <gb...@po...> wrote:
> > BTW, thanks for the logos. It looks pretty good.
>
> Do you want me to check in the header file?
I've seen the logos, too. Excellent. Please check them in.
--
YAEGASHI Takeshi <t...@ke...> <ta...@ya...> |
|
From: Greg B. <gb...@po...> - 2001-08-08 07:40:20
|
NIIBE Yutaka wrote:
>
> Greg Banks wrote:
> > Do you want me to check in the header file?
>
> Yes, please.
Done.
Greg.
--
If it's a choice between being a paranoid, hyper-suspicious global
village idiot, or a gullible, mega-trusting sheep, I don't look
good in mint sauce. - jd, slashdot, 11Feb2000. |