Re: [Tack-devel] back-end tables, assemblers, and front-ends
Moved to https://github.com/davidgiven/ack
From: Gregory T. (t. K. <gt...@di...> - 2006-07-20 13:25:25
At 8:28 AM -0400 7/20/06, David Given wrote:
>Long-term? I've been going at this for a year and a half, now, so I
>suspect that simply producing a usable release counts as long term...

So after your two years of hard work (granting that perhaps another six months of work is ahead), and ten (twenty?) years of existence, ACK'll be an overnight sensation :-)

>But I don't really have anything in the way of specific long-term
>planning. I just want to get it usable enough to gain users. In its
>current state, the 5.6 release is simply too hard to set up, and I'm
>hoping to improve that. With luck, there'll be enough users to develop a
>certain amount of maintenance momentum --- although I doubt the ACK will
>topple gcc, I do think that it can fill a very valuable niche in the
>compiler ecosystem.

I am convinced that an end-to-end BSD-licensed toolchain will be adopted by at least one BSD based OS. It might not be one of the big three, but it will be adopted. I think there has to be very careful monitoring of "let's see how gcc/gas/ld does it" comments down the road.

>Plus, it's an amazingly elegant design, and it would be a terrible shame
>if all that work died due to bit rot.

Agreed.

>[...]
>> Ok. It occurs to me that ACK was written before there was extensive
>> pipelining in CPUs. It also seems to me that a stack-based virtual
>> machine isn't going to be aware of pipelining.
>
>*nods*
>
>There seems to be exactly one usage of the word 'pipeline' in the entire
>source tree in the CPU sense, and it's the bit in the ARM assembler that
>calculates branch offsets...

On PowerPC, some operations that set conditional register values (for branching) can take four cycles and at least one of them (which escapes me at the moment) takes seven (and my apologies if I have the specifics incorrect). Most, if not all, of the algebraic operations have variants that can set the conditional registers at the same time as the operation.
This helps optimize some of the pipelining by recognizing that a subtraction, for example, might later need to compare if the result was less than zero. Both POWER and PowerPC chips benefit greatly from having something constantly going on, so grinding to a halt waiting for the results of a comparison really hurts performance. Having good communication between the peephole optimizer (or the appropriate component that transmits the messages about register usage) and the assembler regarding looking ahead is going to be critical.

Then there are the floating point registers, which have to be aligned on 8 byte boundaries or else the processor generates a floating point exception. And Altivec needs 16 byte alignment or it just chops the data (and can overwrite existing data with garbage).

Oh - does endian-ness affect ACK? PPC is big-endian, with a little-endian mode that is generally avoided whenever possible.

>The good news is I now have my work tree in sufficient state that I can
>compile a C program into ARM assembler!

Excellent!

>The bad news is that when I tell it to compile for i386, it still
>generates ARM assembler... there's a bug in pm that's causing it to pick
>the wrong file when installing each stage. So, everything *compiles*,
>but it doesn't get put together correctly. I'm looking into this.
>Hopefully once I've tracked that down I can declare the new build system
>to be sufficiently working to be useful.

Is there a target arch flag? Pardon my ignorance, but what is pm?

>This is with opt (the global peephole optimiser), but without ego (the
>really heavyweight global optimisers) or top (the target peephole
>optimiser), so the generated code is pretty awful. Also, I suspect that
>the ARM code generator is rather poor. It appears to want to pass all
>parameters on the stack, which disturbs me a little.

Which of course introduces the question of ABIs. What ABIs and file formats are already available in ACK?
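
To make the alignment and endian-ness points above a little more concrete, here is a minimal sketch in C. It is only an illustration under stated assumptions: the function names (host_is_big_endian, is_aligned, store_be32) are hypothetical and are not part of ACK, and the 8 and 16 byte figures are simply the ones mentioned above for floating point and Altivec data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a big-endian host (PPC in its
 * usual mode), 0 on a little-endian host (i386). */
int host_is_big_endian(void)
{
    const uint32_t probe = 0x01020304u;
    unsigned char first;

    /* Look at the byte stored at the lowest address. */
    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/* Hypothetical helper: returns 1 if the address p is a multiple of
 * alignment. The alignment must be a non-zero power of two. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Hypothetical helper: stores a 32-bit value most significant byte
 * first, whatever the byte order of the host doing the storing. */
void store_be32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}
```

A cross-compiling tool could write multi-byte values through something like store_be32 instead of copying host memory, so that output for a big-endian target does not depend on the machine the tool runs on. Likewise, is_aligned(p, 8) or is_aligned(p, 16) is the sort of check that would apply before placing floating point or Altivec data.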
>Unoptimised i386 assembler (after manual hacking to get the code
>generator in place):
<snip>
>There's something very odd going on in that loop.

I'm afraid I can't help you with the x86 code, nor the ARM code (which at least looks something like PPC). I am still working up to grasping the components to ACK and I haven't had a chance to look at the em document. It is high on the list of to-do.

tim

Gregory T. (tim) Kelly
Owner
Dialectronics.com
P.O. Box 606
Newberry, SC 29108

"Anything war can do, peace can do better." -- Bishop Desmond Tutu