Re: [myhdl-list] MyHDL logic synthesis
From: Newell J. <pil...@gm...> - 2009-03-24 03:24:52
On Mon, Mar 23, 2009 at 4:58 AM, Jan Decaluwe <ja...@ja...> wrote:

> Newell Jensen wrote:
> > Jan,
> >
> > Have you ever thought of giving MyHDL synthesis capabilities?
>
> No.
>
> Synthesis, the way I define it, would be a formidable task, certainly
> beyond my capabilities as an open-source developer.
>
> In addition to powerful HDL inference capabilities and logic minimization,
> my definition of synthesis includes timing-driven optimization with an
> integrated timing analyzer and powerful technology mapping, ideally
> including placement info from an integrated P&R tool.
>
> Moreover, what would be the value proposition? Today I get these tools
> basically for free from Xilinx and Altera for their architecture.
>
> > Icarus Verilog can synthesize designs and personally, I would find it a
> > huge advantage to not have to go the MyHDL --> Verilog route as there are
> > many things that are not convertible such as delays etc.
>
> I respect Stephen Williams very much and the value of Icarus the simulator
> is very clear to me, including its tremendous value to the MyHDL project.
> However, I don't see this for Icarus the synthesis tool.
> I don't believe it matches my definition of a synthesis tool. It's kind of
> hard to judge, as there is virtually no documentation that I can find,
> but I'm pretty sure we would have heard about it otherwise.
>
> So I suspect that after Icarus "synthesis" some other tool still has
> to perform some tasks (e.g. timing optimization) that I consider part
> of synthesis. In other words, it probably gives you an entry point
> at a somewhat lower level than Verilog RTL, in a tool flow that you
> have to run anyway, and which is basically free anyway.
> Again, what's the point?

There are a couple of points. As an example, consider the following piece of code that would most likely be used in a software implementation for finding the third power of X.
Note that the term “software” here refers to code that is targeted at a set of procedural instructions that will be executed on a microprocessor.

    XPower = 1;
    for (i = 0; i < 3; i++)
        XPower = X * XPower;

Note that the above code is an iterative algorithm. The same variables and addresses are accessed until the computation is complete. There is no use for parallelism because a microprocessor only executes one instruction at a time (for the purpose of argument, just consider a single core processor). A similar implementation can be created in hardware. Consider the following Verilog implementation of the same algorithm (output scaling not considered):

    module power3(
        output [7:0] XPower,
        output       finished,
        input  [7:0] X,
        input        clk, start); // the duration of start is a single clock

        reg [7:0] ncount;
        reg [7:0] XPower;

        assign finished = (ncount == 0);

        always @(posedge clk)
            if (start) begin
                XPower <= X;
                ncount <= 2;
            end
            else if (!finished) begin
                ncount <= ncount - 1;
                XPower <= XPower * X;
            end
    endmodule

In the above example, the same register and computational resources are reused until the computation is finished as shown in Figure 1.1. With this type of iterative implementation, no new computations can begin until the previous computation has completed. This iterative scheme is very similar to a software implementation. Also note that certain handshaking signals are required to indicate the beginning and completion of a computation. An external module must also use the handshaking to pass new data to the module and receive a completed calculation.
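The clock-by-clock behaviour of the iterative module can be sketched as a small Python model. This is only an illustration and not part of the quoted text: the class name `IterativePower3` and the helper `cube()` are hypothetical, `ncount` is assumed to start at 0 (the Verilog has no reset), and results are truncated to 8 bits to mirror the `[7:0]` registers.

```python
class IterativePower3:
    """Cycle-level model of the iterative power3 module (hypothetical helper)."""

    MASK = 0xFF  # registers are 8 bits wide, as in the Verilog

    def __init__(self):
        self.xpower = 0
        self.ncount = 0  # assumption: the hardware has no reset for this register

    @property
    def finished(self):
        # Mirrors: assign finished = (ncount == 0);
        return self.ncount == 0

    def clock(self, x, start=False):
        """Advance one rising clock edge."""
        if start:
            self.xpower = x & self.MASK
            self.ncount = 2
        elif not self.finished:
            # Non-blocking semantics: both updates read the old register values.
            self.ncount, self.xpower = (
                (self.ncount - 1) & self.MASK,
                (self.xpower * x) & self.MASK,
            )


def cube(x):
    """Run one computation; return (result, number of clocks used)."""
    dut = IterativePower3()
    dut.clock(x, start=True)  # start is asserted for a single clock
    clocks = 1
    while not dut.finished:
        dut.clock(x)  # X must be held stable until finished
        clocks += 1
    return dut.xpower, clocks


if __name__ == "__main__":
    print(cube(3))  # (27, 3)
```

In this model `cube(3)` returns `(27, 3)`: the single multiplier is reused, so one 8-bit result takes three clocks and no new input can be accepted in between.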
The performance of this implementation is

    Throughput = 8/3, or 2.7 bits/clock
    Latency    = 3 clocks
    Timing     = One multiplier delay in the critical path

Contrast this with a pipelined version of the same algorithm:

    module power3(
        output reg [7:0] XPower,
        input            clk,
        input      [7:0] X
        );

        reg [7:0] XPower1, XPower2;
        reg [7:0] X1, X2;

        always @(posedge clk) begin
            // Pipeline stage 1
            X1      <= X;
            XPower1 <= X;
            // Pipeline stage 2
            X2      <= X1;
            XPower2 <= XPower1 * X1;
            // Pipeline stage 3
            XPower  <= XPower2 * X2;
        end
    endmodule

In the above implementation, the value of X is passed to both pipeline stages where independent resources compute the corresponding multiply operation. Note that while X is being used to calculate the final power of 3 in the second pipeline stage, the next value of X can be sent to the first pipeline stage as shown in Figure 1.2. Both the final calculation of X^3 (XPower3 resources) and the first calculation of the next value of X (XPower2 resources) occur simultaneously. The performance of this design is

    Throughput = 8/1, or 8 bits/clock
    Latency    = 3 clocks
    Timing     = One multiplier delay in the critical path

The throughput performance increased by a factor of 3 over the iterative implementation. In general, if an algorithm requiring n iterative loops is “unrolled,” the pipelined implementation will exhibit a throughput performance increase of a factor of n. There was no penalty in terms of latency as the pipelined implementation still required 3 clocks to propagate the final computation. Likewise, there was no timing penalty as the critical path still contained only one multiplier.

So in the end, if I am going to make optimisations and go through the trouble of fine-tuning things, I would hope there is a direct mapping in the conversion. Just to see whether this was so, I tried to code up the second example here in MyHDL. It converts fine, but the result fails during synthesis in Xilinx ISE 10.1 (with the latest service pack).
This is my MyHDL module, followed by the conversion output:

    from myhdl import *

    def power3(XPower, X, clk):

        @always(clk.posedge)
        def logic():
            XPower1 = intbv(min=0, max=256)
            XPower2 = intbv(min=0, max=256)
            X1 = intbv(min=0, max=256)
            X2 = intbv(min=0, max=256)
            # Pipeline stage one
            X1.next = X
            XPower1.next = X
            # Pipeline stage two
            X2.next = X1
            XPower2.next = XPower1 * X1
            # Pipeline stage three
            XPower.next = XPower2 * X2

        return logic

    def convert():
        XPower, X = [Signal(intbv(0)[8:]) for i in range(2)]
        clk = Signal(bool(0))
        toVerilog(power3, XPower, X, clk)

    convert()

#############################################

    // File: power3.v
    // Generated by MyHDL 0.6
    // Date: Mon Mar 23 20:15:45 2009

    `timescale 1ns/10ps

    module power3 (
        XPower,
        X,
        clk
    );

    output [7:0] XPower;
    reg [7:0] XPower;
    input [7:0] X;
    input clk;

    always @(posedge clk) begin: POWER3_LOGIC
        reg [8-1:0] X2;
        reg [8-1:0] X1;
        reg [8-1:0] XPower2;
        reg [8-1:0] XPower1;
        XPower1 = 0;
        XPower2 = 0;
        X1 = 0;
        X2 = 0;
        X1 <= X;
        XPower1 <= X;
        X2 <= X1;
        XPower2 <= (XPower1 * X1);
        XPower <= (XPower2 * X2);
    end

    endmodule

The error that I am getting is:

    Cannot mix blocking and non blocking assignments on signal

So, maybe there is a way to define intermediate registers within the file outside of the blocks? I tried using an @instance block as well but got the same results. Any ideas? This is one of the headaches of switching between one HDL and another.

> If I'm wrong, we can always wrap a synthesize() function around
> the Icarus engine :-)
>
> From your question I infer that you assume that a direct synthesis
> flow from MyHDL would somehow remove some synthesis-related restrictions.
> But that is not true. The restrictions would be just the same as
> today. They are there for Icarus synthesis also, believe me.
>
> However, those "synthesis restrictions" are in fact badly explained in
> text books. So what could be meaningful is to write a guide for
> "Efficient synthesis with MyHDL".
> (If I would do that, it would be
> totally different from what you read today. I would basically start
> with synchronous processes and flip-flop inferencing from variables.)
>
> > I haven't looked into what would need to happen to make this happen but
> > I wanted to ask you to see what you thought about this.
> >
> > Personally, writing everything in Python would be a dream come true.
>
> For all practical purposes, for me this dream is true today. After a
> project is properly set up, conversion is hidden somewhere in a Makefile
> right before synthesis. Verilog is just one of the many back-end formats
> used by the back-end tools needed to go from MyHDL to an implementation.
> That's how I see it, and it works fine.
>
> Jan
>
> --
> Jan Decaluwe - Resources bvba - http://www.jandecaluwe.com
> From Python to silicon:
> http://www.myhdl.org

--
Newell
http://www.gempillar.com

Before enlightenment: chop wood, carry water
After enlightenment: code, build circuits