From: Anthony J B. <net...@nc...> - 2008-01-19 22:00:26
On Fri, 18 Jan 2008, Stephen Williams wrote:

> That's always been a problem because of the way that the Verilog
> timing model works, and especially how behavioral and netlist code
> interact in a Verilog design. Multi-threading Verilog simulations
> is highly tricky.

Right, especially given that if it's done wrong, the results of your sim run could be non-deterministic from run to run. Not necessarily incorrect, but different or irreproducible results are equally bad and will result in unprecedented levels of fanmail. =)

I've been putting it off for years, but sometime I do need to look at the generated vvp to see what kinds of event cones (for lack of a better word) can be extracted. (I don't mean basic blocks.)

A lot of those bitwise logic ops can be done as a branchless SWAR op assuming you split your value into bitwise parallel 01 and XZ value planes. (SWAR = SIMD within a register.) There's more going on calculation-wise but the odds of a branch mispredict flush are zero as no branches are required.

For example, run the program below. I only print one bit, but by extension you can go as wide as you want. This is exactly what VCS does if you look at the generated C code while gcc is in the middle of chewing on it.
-Tony

#include <stdio.h>

/*
 * Two-bit encoding of one 4-state value (XZ plane bit, then 01 plane bit):
 * 00 '0'
 * 01 '1'
 * 10 'Z'
 * 11 'X'
 */
#define MVL(a,b) "01ZX"[((a)&1)*2+((b)&1)]

/*
 * ITERn opens a loop body over every input combination; TAILn masks the
 * result to one bit, prints it, and closes the loop body. TAIL4 prints
 * '&' between the operands for every operator; the heading names the op.
 */
#define ITER2 for(t3=0;t3<2;t3++) for(t4=0;t4<2;t4++) {
#define TAIL2 t0&=1; t1&=1; printf("%d%d -> %d%d \t ~%c -> %c\n", t3,t4, t0, t1, MVL(t3,t4), MVL(t0, t1));\
        }printf("\n");

#define ITER4 for(t3=0;t3<2;t3++) for(t4=0;t4<2;t4++) for(t5=0;t5<2;t5++) for(t6=0;t6<2;t6++) {
#define TAIL4 t0&=1; t1&=1; printf("%d%d & %d%d -> %d%d \t %c & %c -> %c\n", t3,t4, t5, t6, t0, t1, MVL(t3,t4), MVL(t5,t6), MVL(t0, t1));\
        }printf("\n");

/* Inputs are (t3,t4) and (t5,t6); the result is (t0,t1). */
int main(void)
{
        int c, d, t0, t1, t3, t4, t5, t6;

        printf("NOT\n===\n");
        ITER2
                t0=t3;
                t1=(t3|~t4);
        TAIL2

        printf("AND\n===\n");
        ITER4
                d=(t3|t4)&(t5|t6);
                t0=d&(t3|t5);
                t1=d;
        TAIL4

        printf("OR\n==\n");
        ITER4
                c=(t3^t5)^((t3|t4)&(t5|(t6&t3)));
                t0=c;
                t1=((t4|t6)|c);
        TAIL4

        printf("XOR\n===\n");
        ITER4
                c=t3|t5;
                t0=c;
                t1=(c|(t4^t6));
        TAIL4

        printf("NAND\n====\n");
        ITER4
                d = (t3 | t4) & (t5 | t6);
                c = d & (t3 | t5);
                d = c | (~d);
                t0 = c;
                t1 = d;
        TAIL4

        printf("NOR\n===\n");
        ITER4
                c = (t3 ^ t5) ^ ((t3 | t4) & (t5 | (t6 & t3)));
                d = c | (~((t4 | t6) | c));
                t0 = c;
                t1 = d;
        TAIL4

        printf("XNOR\n====\n");
        ITER4
                c = t3 | t5;
                d = c | (~(c | (t4 ^ t6)));
                t0 = c;
                t1 = d;
        TAIL4

        return 0;
}