From: Carlos S. de La L. <car...@ur...> - 2011-12-19 11:43:43
>> Also the final binary, optionally.
>
> OK. In our case the kernel just might have multiple versions for
> the multiple dimensions in the .text section. Should work...

Or even multiple versions of the binary in different sections.

>>> The OpenCL API for fetching and loading the program binaries is
>>> multi-device.
>>> Thus the format should not be tied to an architecture as it can
>>> contain the
>>> same kernels compiled for multiple devices.
>>
>> What does this mean?
>
> I think it means that for example in case of AMD you could have
> the CPU and the GPU (device) versions of the program in the same
> (OpenCL) binary. I see from the specs that they do not support this but
> store only the GPU or CPU bits but not both:
>
> "By default, OpenCL generates a binary that has LLVM IR, AMD IL, and the
> executable for the GPU (,.llvmir, .amdil, and .text sections), as well as
> LLVM IR and the executable for the CPU (.llvmir and .text sections)."?

Given that our LLVM IR format can be tied to a device-dependent architecture, we are not going to support binary retargeting anyway, so we should not bother about that.

> ELF has only one architecture-specific .text section, IIUC so it would
> not work for this.

Again, ELF is used in BIF just as a wrapper, so you can create a ".myownstuff" section and put whatever you want inside. There is no need for it to be in the ".text" section.

>> Any other option (tar/zip/ELF/whatever) would do the same, but as this
>> is documented and used on a OpenCL SDK I would suggest doing the same.
>
> I do not consider the main advantage to be that it's used by AMD. But
> in case it can be used as a directly dlopenable program binary then it's
> a real advantage (BTW in MacOS or at least Windows we might need something
> else then?). It would avoid the objcopy step in case the binary contains
> a kernel version suitable for launching directly for the given
> dimensions... probably a small saving but still a nifty thing to have.

If we want to be able to dlopen the binary directly, then we need something like FatELF. But as you found out, the project seems to be half dead, and dlopen is not going to support FatELF binaries on almost any system, so we would end up with more stuff to fix ourselves. I would go for the "keep it simple" way.

Carlos
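
As a rough illustration of the "ELF is just a wrapper" point above, here is a minimal Python sketch that packs arbitrary payloads into named sections of a bare little-endian ELF64 file and reads one back by name. Everything specific in it is an assumption made for the example: the helper names build_elf and read_section, the section names, and the payload bytes are hypothetical, and this is not the actual BIF layout. It only shows that a custom section such as ".myownstuff" can carry any bytes, independent of ".text".

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header, 64 bytes
SHDR = struct.Struct("<IIQQQQIIQQ")        # ELF64 section header, 64 bytes
SHT_PROGBITS, SHT_STRTAB = 1, 3


def build_elf(sections):
    """Wrap {section_name: payload_bytes} in a minimal ELF64 relocatable file."""
    names = list(sections) + [".shstrtab"]
    strtab, name_off = b"\0", {}
    for name in names:                      # build the section-name string table
        name_off[name] = len(strtab)
        strtab += name.encode() + b"\0"
    payloads = list(sections.values()) + [strtab]

    body = b""
    headers = [SHDR.pack(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)]  # mandatory null section
    for name, data in zip(names, payloads):
        sh_type = SHT_STRTAB if name == ".shstrtab" else SHT_PROGBITS
        offset = EHDR.size + len(body)
        headers.append(SHDR.pack(name_off[name], sh_type, 0, 0,
                                 offset, len(data), 0, 0, 1, 0))
        body += data
    body += bytes(-len(body) % 8)           # align the section header table

    # e_ident: magic, ELFCLASS64, little-endian, version 1, then zero padding
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    ehdr = EHDR.pack(ident, 1, 0, 1, 0, 0, EHDR.size + len(body), 0,
                     EHDR.size, 0, 0, SHDR.size, len(headers), len(headers) - 1)
    return ehdr + body + b"".join(headers)


def read_section(blob, wanted):
    """Return the payload of the section named `wanted`, or None if absent."""
    fields = EHDR.unpack_from(blob, 0)
    shoff, shnum, shstrndx = fields[6], fields[12], fields[13]
    shdrs = [SHDR.unpack_from(blob, shoff + i * SHDR.size) for i in range(shnum)]
    str_off = shdrs[shstrndx][4]            # file offset of .shstrtab
    for sh in shdrs[1:]:                    # skip the null section
        start = str_off + sh[0]
        name = blob[start:blob.index(b"\0", start)].decode()
        if name == wanted:
            return blob[sh[4]:sh[4] + sh[5]]
    return None


# Hypothetical contents: some IR bytes plus a custom, non-.text section.
blob = build_elf({".llvmir": b"BC\xc0\xde",
                  ".myownstuff": b"kernel for local size 8x8"})
print(read_section(blob, ".myownstuff"))  # b'kernel for local size 8x8'
```

Nothing here makes the file loadable with dlopen; it only uses ELF as a container, which is the "keep it simple" option discussed above.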