From: Carlos S. de La L. <car...@ur...> - 2011-10-24 13:48:16
Hi Erik,

On Mon, 2011-10-24 at 09:20 -0400, Erik Schnetter wrote:
> 2011/10/24 Carlos Sánchez de La Lama <car...@ur...>:
> > Hi Erik,
> >
> > thanks for your contributions. Some of the questions you ask are open
> > for discussion, as the kernel library implementation is still at a
> > really early stage of development.
> >
> >> - Is this implementation / coding style approximately acceptable?
> >
> > It is all right. We do not have an "official" coding guideline for
> > pocl (yet), but basically I tried to follow the GNU Coding Standards,
> > except in the LLVM passes, where the LLVM guidelines are used.
> >
> > I have seen you call the C library functions to implement the
> > functionality. This is OK for "native" environments, but in a real
> > device scenario there is not going to be an underlying C library
> > providing, for example, a "cos" function. One possible option is to
> > use some other embedded library (newlib?) underneath our kernel
> > library, though I have the feeling this might be overkill and make
> > the device binaries too big. Otherwise we would need to implement
> > the "cos", "sin", whatever... functions completely in the kernel
> > library.
> >
> > Thoughts?
>
> To get an efficient implementation, we will need to have something
> hardware-dependent; a cross-platform implementation can only be a
> fallback. It was my impression that POCL currently concentrates on
> running on the host, where a libc is always available.

POCL concentrates on static parallel architectures, really. Only the
host "driver" comes in the source code, but POCL itself started as the
development of OpenCL support for the TCE project
(http://tce.cs.tut.fi). We generalized the passes to make it portable
and more widely useful.

> For example, on Intel, one would want to use the fsin machine
> instruction, and there are similar machine instructions (or sequences
> thereof) for other architectures.
>
> For the fallback implementation, I would definitely use an existing
> library. I don't know newlib, but it may be the way to go.

Of course. Even newlib would not be target independent; it is compiled
for a given architecture and has its own set of processor-dependent
implementations of the math functions (about their level of efficiency
I am not sure). Agreed on the need for different library
implementations.

> >> - Since some of the code is highly repetitive, should we use a
> >> templating mechanism, probably built onto m4?
> >
> > If it is *really* needed, then we can use it. But I am not sure how
> > much effort it saves compared with the C preprocessor. If we want to
> > make the system as portable as possible, it is desirable to keep
> > dependencies to a minimum.
> >
> > Can you provide a (simple) example of a case in the kernel library
> > where M4 macros would help?
>
> It could provide a mechanism that is not per-file, but more generic.
> For example, the implementations of sin, cos, tan, etc. are very
> similar. In other words, OpenCL doesn't have #include (or does it when
> the kernel library is built?), and using only #define leads to a lot
> of duplication across source files.

We have #include and #define and all of those during the library
building process. I know sin/cos and the other math functions are quite
repetitive; my doubt is whether the amount of "saved work" with m4 over
the plain C preprocessor justifies the dependency on m4. For example,
cos/sin/tan can share the same C file using a generic FUNC macro (let's
say trig.inc):

#define FUNC cos
#include "trig.inc"
#define FUNC sin
#include "trig.inc"
...
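For illustration, here is a minimal sketch of what such a shared
trig.inc could look like. This is a hypothetical example, not actual
pocl code: the "my_" prefix and the libc-wrapping body are only
placeholders for whatever fallback the kernel library ends up using on
the host driver.

  /* trig.inc -- expects the includer to #define FUNC (cos, sin, tan, ...)
     before each inclusion; every inclusion emits one wrapper function. */
  #include <math.h>        /* host fallback: libc provides cos(), sin(), ... */

  #define CONCAT_(p, f) p##f
  #define CONCAT(p, f)  CONCAT_(p, f)

  /* Emits e.g. "float my_cos(float)" wrapping libc's double-precision cos(). */
  float CONCAT(my_, FUNC)(float x)
  {
    return (float) FUNC((double) x);
  }

  #undef CONCAT
  #undef CONCAT_
  #undef FUNC             /* so the includer can #define FUNC again cleanly */

Including the file once per function instantiates the whole family from
a single template, with no tool beyond the C preprocessor.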
Again, if there is a clear gain in using m4, no problem, but I am
unsure it really saves that much typing.

BR,
Carlos