From: Borja F. <bor...@gm...> - 2011-04-05 23:44:46
Hello Eric,

I phrased that badly, sorry :) When I wrote that movws were always going to be emitted, I meant that they would be emitted directly by the compiler rather than inserted manually, as currently happens. The current implementation searches for 8-bit moves and tries to transform two moves in a row into a single movw, but it misses many cases, which is why movws don't always get emitted. With the new implementation, since the compiler emits real 16-bit moves, we will get a movw whenever one is needed, so there's no danger of movws getting lost.

As for what you mentioned about other devices, indeed this is something that has to be done. I already had it in mind, thinking about the tiny devices that lack support for mul, movw and other instructions and have 8-bit pointers, and the largest ones, which have 24-bit pointers. LLVM has a very nice interface to handle this sort of thing, so in theory it shouldn't be hard to implement, just really tedious. Basically, you first define a device model, say the ATmega644PA, and in it you list its supported features. On x86 these features would be SSE1, SSE2, AVX, etc.; in our case we would have MUL, MOVW, ELPM and friends. Then, when emitting code, we check whether, for example, movw is supported by the current device; if it isn't, LLVM has to take another path and emit a different instruction. Likewise, if the device doesn't have a built-in multiplier, we either expand the mult instruction into a chain of adds or make a libcall.

I'm really interested in your help when we start supporting other devices, to get things right and support every single device on the market. We'll need a good classification of the devices to list the features I mentioned, so we can add them for every supported device.
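
To make the feature check a bit more concrete, here is a rough standalone C++ sketch of the idea. It does not use the real LLVM interfaces: the Feature enum, the Device struct and the functions emitMove16, lowerMul8 and mulShiftAdd are all hypothetical names made up for illustration, and the feature sets are passed in by the caller rather than taken from any real datasheet.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical feature bits, named after the instructions discussed above.
enum Feature : std::uint32_t {
  FeatureMOVW = 1u << 0,
  FeatureMUL  = 1u << 1,
  FeatureELPM = 1u << 2,
};

// Hypothetical device model: a name plus the set of features it supports.
struct Device {
  std::string name;
  std::uint32_t features;
  bool has(Feature f) const { return (features & f) != 0; }
};

// Copy a 16-bit value between register pairs, given the low register of each
// pair. Uses one movw when the device supports it, otherwise two 8-bit movs.
std::vector<std::string> emitMove16(const Device &dev, int dstLo, int srcLo) {
  auto r = [](int n) { return "r" + std::to_string(n); };
  if (dev.has(FeatureMOVW))
    return {"movw " + r(dstLo) + ", " + r(srcLo)};
  return {"mov " + r(dstLo) + ", " + r(srcLo),
          "mov " + r(dstLo + 1) + ", " + r(srcLo + 1)};
}

// How an 8-bit multiply would be lowered for a given device.
enum class MulLowering { HardwareMul, SoftwareExpand };

MulLowering lowerMul8(const Device &dev) {
  return dev.has(FeatureMUL) ? MulLowering::HardwareMul
                             : MulLowering::SoftwareExpand;
}

// One possible software expansion: multiply by shifting and adding.
// The result is truncated to 8 bits, i.e. computed modulo 256.
std::uint8_t mulShiftAdd(std::uint8_t a, std::uint8_t b) {
  std::uint8_t result = 0;
  while (b != 0) {
    if (b & 1)
      result = static_cast<std::uint8_t>(result + a);
    a = static_cast<std::uint8_t>(a << 1);
    b >>= 1;
  }
  return result;
}
```

With a device that has FeatureMOVW set, emitMove16(dev, 24, 22) yields a single "movw r24, r22"; without it, the same call yields "mov r24, r22" followed by "mov r25, r23". In a real backend the software multiply would more likely be a libcall than inline code; the shift-and-add loop is only there to show what the expansion computes.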
For the moment I'm focusing on the ATmega644, which has lots of features in its CPU core. For smaller devices we'll have to add restrictions as explained above to avoid illegal instructions getting emitted, and for larger devices ... well, I can't talk about those for now because I've never worked with them.

OFFTOPIC: During the past weeks I've been implementing a 4-stage pipelined MIPS core in Verilog on an FPGA for a master's class project, and now I really have a good feeling for the beauty of what is inside RISC CPUs. It's nice to have different points of view: the programmer, the compiler, the CPU core hardware...