From: David O. <da...@ga...> - 2000-07-08 01:43:16

On Thu, 06 Jul 2000, Dean Bilotti wrote:
> Yep I tried rotation too. 3 clocks. :)
>
> I did think about the Pentium superpipelines and superscaling. That's why I
> did some tests and timed them. :) The difference in clock counts for a 486
> processor seemed to be very much in line with real life Pentium
> Performance.

Did you try breaking out (unrolling) loops and interleaving the use of
different registers? An instruction has to stall until its input data is
ready, but you can usually squeeze in some other operation in the meantime.

A good rule is to do as many things at once as possible without running out
of registers, and to interleave operations so that you wait as long as
possible before requiring the result of an instruction. That usually makes
quite a difference, and it's also the reason why it's not trivial to hack
better code than a good compiler - this coding style tends to generate very
messy code...

Pipelining relies totally on the code being physically possible to
parallelize within the pipeline depth, as CPUs unfortunately still lack
TTUs. (Time Travel Units) ;-)

BTW, MMX, KNI and co. are even more sensitive to this, as their cores are
usually better optimized for pipelining than the normal CPU core. (It gets
easier when conditional code is normally handled with true/false masks
rather than skips.)

As an example, Intel MMX multiplications can be fully pipelined, as opposed
to the standard ALU version. They take 3 cycles to finish (just as the
normal MUL), but you can start a new one every cycle, as long as the data is
available. (Can't figure out how you'd feed it like that continuously, but at
least it eliminates one or two stall cycles every now and then.)

//David

.- M u C o S --------------------------------. .- David Olofson ------.
| A Free/Open Multimedia                     | |     Audio Hacker     |
| Plugin and Integration Standard            | |    Linux Advocate    |
`------------> http://www.linuxdj.com/mucos -' | Open Source Advocate |
.- A u d i a l i t y ------------------------. |        Singer        |
| Rock Solid Low Latency Signal Processing   | |      Songwriter      |
`---> http://www.angelfire.com/or/audiality -' `-> da...@li... -'
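
A minimal sketch of the interleaving idea described above, written in C rather than assembly so that it stays portable. The function names (dot_serial, dot_interleaved) and the unroll factor of four are assumptions made for illustration, not something taken from the thread. The first loop forms one long dependency chain through a single accumulator; the second keeps four independent accumulators, so a multiply-add for one of them does not have to wait for the result of another. Whether this is actually faster depends on the CPU and the compiler (a good compiler may do the same transformation itself), so time it before relying on it.

```c
#include <stddef.h>
#include <stdint.h>

/* Serial version: every addition needs the previous value of acc,
   so the multiply-adds form a single dependency chain. */
uint32_t dot_serial(const uint32_t *a, const uint32_t *b, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += a[i] * b[i];
    return acc;
}

/* Unrolled by four with independent accumulators (hypothetical example):
   the four statements in the loop body do not depend on each other,
   so they can overlap in the pipeline instead of stalling. */
uint32_t dot_interleaved(const uint32_t *a, const uint32_t *b, size_t n)
{
    uint32_t acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        acc0 += a[i]     * b[i];
        acc1 += a[i + 1] * b[i + 1];
        acc2 += a[i + 2] * b[i + 2];
        acc3 += a[i + 3] * b[i + 3];
    }

    /* Handle the leftover elements when n is not a multiple of four. */
    for (; i < n; i++)
        acc0 += a[i] * b[i];

    /* The results are only combined once, at the very end. */
    return acc0 + acc1 + acc2 + acc3;
}
```

Unsigned 32-bit arithmetic is used so that both versions return identical results even if the sums wrap around. With floating point, the reordered additions could round differently.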