From: Anthony J B. <net...@nc...> - 2008-01-19 22:00:26
On Fri, 18 Jan 2008, Stephen Williams wrote:
> That's always been a problem because of the way that the Verilog
> timing model works, and especially how behavioral and netlist code
> interact in a Verilog design. Multi-threading Verilog simulations
> is highly tricky.
Right, especially given that if it's done wrong, the results of your sim
run could be non-deterministic from run to run. Not necessarily
incorrect, but different or irreproducible results are equally bad and
will result in unprecedented levels of fanmail. =)
I've been putting it off for years, but sometime I do need to look at the
generated vvp to see what kinds of event cones (for lack of a better word)
can be extracted. (I don't mean basic blocks.) A lot of those bitwise
logic ops can be done as a branchless SWAR op assuming you split your
value into bitwise parallel 01 and XZ value planes. (SWAR = SIMD within a
register.) There's more going on calculation-wise but the odds of a
branch mispredict flush are zero as no branches are required. For
example, run the program below. I only print one bit, but by extension
you can go as wide as you want. This is exactly what VCS does if you look
at the generated C code while gcc is in the middle of chewing on it.
-Tony
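#include <stdio.h>
#include <stdlib.h>
/*
 * Two-bit encoding of one four-state value
 * (first bit: XZ plane, second bit: 01 plane):
 * 00 '0'
 * 01 '1'
 * 10 'Z'
 * 11 'X'
 */
#define MVL(a,b) "01ZX"[((a)&1)*2+((b)&1)]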
#define ITER2 for(t3=0;t3<2;t3++) for(t4=0;t4<2;t4++) {
#define TAIL2 t0&=1; t1&=1; \
    printf("%d%d -> %d%d \t ~%c -> %c\n", t3, t4, t0, t1, \
           MVL(t3,t4), MVL(t0,t1)); \
    } printf("\n");
#define ITER4 for(t3=0;t3<2;t3++) for(t4=0;t4<2;t4++) \
    for(t5=0;t5<2;t5++) for(t6=0;t6<2;t6++) {
/* TAIL4 always prints '&' as the operator; the heading printed before
 * each table names the op actually being computed. */
#define TAIL4 t0&=1; t1&=1; \
    printf("%d%d & %d%d -> %d%d \t %c & %c -> %c\n", \
           t3, t4, t5, t6, t0, t1, \
           MVL(t3,t4), MVL(t5,t6), MVL(t0,t1)); \
    } printf("\n");
int main(void)
{
    /* Operands are (t3,t4) and (t5,t6); the result is (t0,t1).
     * The first bit of each pair is the XZ plane, the second the 01 plane. */
    int c, d, t0, t1, t3, t4, t5, t6;

    printf("NOT\n===\n");
    ITER2
        t0 = t3;
        t1 = (t3 | ~t4);
    TAIL2

    printf("AND\n===\n");
    ITER4
        d = (t3 | t4) & (t5 | t6);
        t0 = d & (t3 | t5);
        t1 = d;
    TAIL4

    printf("OR\n==\n");
    ITER4
        c = (t3 ^ t5) ^ ((t3 | t4) & (t5 | (t6 & t3)));
        t0 = c;
        t1 = ((t4 | t6) | c);
    TAIL4

    printf("XOR\n===\n");
    ITER4
        c = t3 | t5;
        t0 = c;
        t1 = (c | (t4 ^ t6));
    TAIL4

    printf("NAND\n====\n");
    ITER4
        d = (t3 | t4) & (t5 | t6);
        c = d & (t3 | t5);
        d = c | (~d);
        t0 = c;
        t1 = d;
    TAIL4

    printf("NOR\n===\n");
    ITER4
        c = (t3 ^ t5) ^ ((t3 | t4) & (t5 | (t6 & t3)));
        d = c | (~((t4 | t6) | c));
        t0 = c;
        t1 = d;
    TAIL4

    printf("XNOR\n====\n");
    ITER4
        c = t3 | t5;
        d = c | (~(c | (t4 ^ t6)));
        t0 = c;
        t1 = d;
    TAIL4

    return 0;
}
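
To illustrate the "go as wide as you want" remark, here is a rough sketch of the same NOT, AND and XOR formulas applied to whole 32-bit planes instead of single bits. The vec4 type and the vec4_* helper names are hypothetical, made up for this example; they are not taken from vvp or from VCS output. Only the three logic ops are branchless; the string conversion helpers exist just to make the result readable.

```c
#include <stdint.h>

/* Hypothetical 32-bit four-state vector, stored as two parallel planes.
 * Bit i of xz and bit i of v encode one Verilog bit with the same table
 * as the one-bit program: 00 '0', 01 '1', 10 'Z', 11 'X'. */
typedef struct {
    uint32_t xz;  /* XZ plane: set when the bit is Z or X */
    uint32_t v;   /* 01 plane: value bit */
} vec4;

/* Same formula as the NOT table: t0=t3; t1=t3|~t4, on 32 bits at once. */
static vec4 vec4_not(vec4 a)
{
    vec4 r = { a.xz, a.xz | ~a.v };
    return r;
}

/* Same formula as the AND table: d=(t3|t4)&(t5|t6); t0=d&(t3|t5); t1=d. */
static vec4 vec4_and(vec4 a, vec4 b)
{
    uint32_t d = (a.xz | a.v) & (b.xz | b.v);
    vec4 r = { d & (a.xz | b.xz), d };
    return r;
}

/* Same formula as the XOR table: c=t3|t5; t0=c; t1=c|(t4^t6). */
static vec4 vec4_xor(vec4 a, vec4 b)
{
    uint32_t c = a.xz | b.xz;
    vec4 r = { c, c | (a.v ^ b.v) };
    return r;
}

/* Parse a string of '0', '1', 'Z', 'X' (MSB first, at most 32 chars).
 * This helper branches; it is only here for the demo. */
static vec4 vec4_from_str(const char *s)
{
    vec4 r = { 0, 0 };
    for (; *s; s++) {
        r.xz <<= 1;
        r.v <<= 1;
        switch (*s) {
        case '1': r.v |= 1; break;
        case 'Z': case 'z': r.xz |= 1; break;
        case 'X': case 'x': r.xz |= 1; r.v |= 1; break;
        default: break;  /* anything else is treated as '0' */
        }
    }
    return r;
}

/* Write the low `width` bits as text, MSB first; out needs width+1 bytes. */
static void vec4_to_str(vec4 a, int width, char *out)
{
    for (int i = 0; i < width; i++) {
        int bit = width - 1 - i;
        unsigned hi = (a.xz >> bit) & 1u;
        unsigned lo = (a.v >> bit) & 1u;
        out[i] = "01ZX"[hi * 2 + lo];
    }
    out[width] = '\0';
}
```

As a usage example, ANDing the vectors parsed from "00001111ZZZZXXXX" and "01ZX01ZX01ZX01ZX" with a single vec4_and call and printing 16 bits gives "000001XX0XXX0XXX", which is the full 16-row AND table computed in one pass.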