Re: [myhdl-list] MyHDL performance
From: Jan D. <ja...@ja...> - 2006-11-30 09:46:53
Martin d Anjou wrote:
> Hi,
>
> I still hesitate to use MyHDL because of performance concerns for doing
> ASIC verification (I have close to zero interest in design). I have been
> using "a proprietary verification language" for quite some time now.
> Recognizing limitations of this proprietary solution, I've been exploring
> alternatives like C++ trusster.com, Digital Mars D StackThreads
> (assertfalse.com), Java Jove, and of course MyHDL for its ease of use.
>
> For example I have run a benchmark using the MyHDL producer-consumer and
> the gray encoder in the documentation. The producer-consumer example is as
> fast as an equivalent implementation in "the proprietary language". The
> gray encoder is 10x slower, but I think the intbv is the bottleneck.

I wouldn't be surprised if some intbv operations, such as indexing, turn out to be quite slow.

> For code size, MyHDL producer-consumer beats the proprietary solution by a
> great margin, and for gray coding the code sizes are almost equal. I have
> not looked at memory footprint.
>
> Do you think it is worth converting intbv or other aspects of MyHDL to C?

Converting the intbv class and possibly the Signal class to C may make a lot of sense. It wouldn't be a small task, so this should be confirmed by experiments. Perhaps it's possible to start with those functions that are really slow, instead of the whole class at once.

> Have you explored ways to speed up MyHDL (greenlet, stackless python -
> which is still active)?

Yes - some comments:

Greenlets, Stackless Python and generators have in common that they can model light-weight, massive parallelism. Greenlets and Stackless are more powerful, but I believe generators are good enough for HDL purposes. The big advantage of generators is that they are standard Python, unlike Stackless, so I can develop MyHDL as a conventional Python package that works with the standard interpreter. I wouldn't anticipate a performance advantage from using greenlets or Stackless.
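
As a rough illustration of how plain generators model light-weight parallelism, here is a minimal sketch of a producer and a consumer driven by a round-robin loop. It is not MyHDL code: the names `producer`, `consumer` and `run` are made up for this example, there are no signals or delta cycles, and it assumes a current Python 3 interpreter.

```python
from collections import deque

def producer(queue, n):
    """Put n items on the queue, yielding control after each one."""
    for i in range(n):
        queue.append(i)
        yield  # hand control back to the scheduler

def consumer(queue, received):
    """Drain whatever is on the queue, then yield and wait for more."""
    while True:
        while queue:
            received.append(queue.popleft())
        yield

def run(tasks, max_steps):
    """Resume each generator in turn until all finish or max_steps is hit."""
    ready = deque(tasks)
    steps = 0
    while ready and steps < max_steps:
        task = ready.popleft()
        try:
            next(task)           # run the task up to its next yield
        except StopIteration:
            continue             # finished tasks are simply dropped
        ready.append(task)
        steps += 1

queue, received = deque(), []
run([producer(queue, 5), consumer(queue, received)], max_steps=50)
print(received)
```

Each "process" is just a suspended function frame, so thousands of them are cheap. What is missing here is exactly the part that costs time in a real HDL simulator: deciding which generator to resume, and when.
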
In all cases, we also need a rather strict scheduler with significant overhead: we have to implement the delta cycle algorithm and signals on top of the parallelization method.

At one time I experimented with converting the main simulator loop to C. Basically, I replaced Python access to high-level data structures with C API access to the same high-level data structures. That doesn't help a lot. Rather, performance advantages can be expected when you can use low-level C types instead of Python types.

Another approach has been more successful. Observe that MyHDL generators can be sensitive to a number of different objects (edge, tuple of edges, signal, tuple of signals, delay ...). In general, the scheduler has to take all possibilities into account each time a generator yields. This is inefficient because many generators will be sensitive to a single kind of object only. So what has been done is that the code of each generator is inspected (yes!) before the simulation starts. Based on that, the generator gets a dedicated scheduler which is efficient for its kind of sensitivity. This technique removes a lot of simulation overhead for typical use cases.

What next? A meaningful next step could be to migrate some functionality to C, as you suggested.

Finally, there is PyPy. I once saw a demo by Armin Rigo showing a massive speedup by using psyco. Unfortunately, psyco cannot handle generators. Instead, Armin and others started the PyPy project, which may one day bring psyco-like advantages to general Python code. It would seem to me that MyHDL is a good candidate, because a lot of code is run over and over again during simulation. So perhaps one day I'll be able to report a massive speedup without having to do anything myself :-) That will be the day!

Jan

--
Jan Decaluwe - Resources bvba - http://www.jandecaluwe.com
Losbergenlaan 16, B-3010 Leuven, Belgium
From Python to silicon: http://myhdl.jandecaluwe.com
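
The dedicated-scheduler idea described above can be sketched roughly as follows. This is not the MyHDL implementation: `Delay`, `Edge`, the `*_dispatch` functions and `pick_dispatch` are hypothetical names, and looking at the names referenced by a function's code object is only a crude stand-in for real code inspection. The point is just that the decision is made once, before simulation, rather than on every yield.

```python
class Delay:
    """Hypothetical trigger: resume after a number of time steps."""
    def __init__(self, ticks):
        self.ticks = ticks

class Edge:
    """Hypothetical trigger: resume on an edge of a named signal."""
    def __init__(self, signal_name):
        self.signal_name = signal_name

def delay_dispatch(sim, gen, trigger):
    # Specialized path: no type checks, the trigger is known to be a Delay.
    sim["timed"].append((sim["now"] + trigger.ticks, gen))

def edge_dispatch(sim, gen, trigger):
    # Specialized path: the trigger is known to be an Edge.
    sim["edge_waiters"].setdefault(trigger.signal_name, []).append(gen)

def general_dispatch(sim, gen, trigger):
    # Fallback path: examine the trigger kind on every single yield.
    if isinstance(trigger, Delay):
        delay_dispatch(sim, gen, trigger)
    elif isinstance(trigger, Edge):
        edge_dispatch(sim, gen, trigger)
    else:
        raise TypeError("unsupported trigger: %r" % (trigger,))

def pick_dispatch(genfunc):
    """Choose a dispatcher once, from the names the function's code uses."""
    code = genfunc.__code__
    names = set(code.co_names) | set(code.co_freevars)
    uses_delay, uses_edge = "Delay" in names, "Edge" in names
    if uses_delay and not uses_edge:
        return delay_dispatch
    if uses_edge and not uses_delay:
        return edge_dispatch
    return general_dispatch

def clock_gen():
    while True:
        yield Delay(10)

def mixed():
    yield Edge("clk")
    yield Delay(5)

sim = {"now": 0, "timed": [], "edge_waiters": {}}
gen = clock_gen()
dispatch = pick_dispatch(clock_gen)   # decided once, up front
dispatch(sim, gen, next(gen))         # reused each time the generator yields
```

Here `clock_gen` only ever yields delays, so it gets `delay_dispatch` and skips the type checks entirely, while `mixed` falls back to `general_dispatch`.
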