From: Cary R. <cy...@ya...> - 2014-03-06 19:54:36
When I looked at this a few years ago, the GPU code/data model seemed too restrictive to provide much benefit. Newer GPUs may be more flexible, so my comments and conclusions may be somewhat out of date. I'm still not certain how much benefit you can get from a GPU with an event-driven simulation. You may be able to speed up specific operations if they can be made to fit the GPU code/data model (e.g. large vector multiplication, division, modulus, power, etc.). If the non-determinism and partitioning can be dealt with, multi-threaded (multi-CPU) simulation could also be beneficial, but that is a change from your original direction.

If you want to stick with a GPU approach, you can create a simple code example using the current vvp_vector2_t class to test the speed of the various operations at various data widths, and then see whether, using the same vvp_vector2_t input data, you can make a faster GPU implementation of the wide vector operations (vectors with more than BITS_PER_WORD bits). A rough sketch of what I mean is at the bottom of this message.

The procedural vector operators are currently in flux, since Steve is changing them to be stack based. When he is done they should also be looked at. The current operators may not be fully optimized, so keep that in mind (it may be better to optimize the current operator than to go to a GPU approach). Since this is likely a learning/educational exercise for you that may not matter much, but you do need to keep it in mind when drawing conclusions. The most valid conclusion would come from comparing two fully optimized routines. Whatever you conclude, we would benefit if the current routines could be optimized.

On Wednesday, March 5, 2014 10:56 PM, Arun P <aru...@gm...> wrote:

Thanks a lot Jared and Steve for your kind response. Due to time constraints, I thought I could look for a function or block where the complex computation occurs. Now, as you have suggested, I want to understand how the vvp runtime works. Is there proper documentation describing the flow of the simulation process, so that I can find where the computation happens? (Is there a particular function or file involving complex computation?) Looked at from a higher level this might not give a significant performance gain, but I think I can start with it to explore GPU processing. Looking forward to your suggestions.

Thanks,
Arun.
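Here is roughly what I have in mind for the CPU side of such a test. This is only a self-contained stand-in, not the actual vvp code: the word-array representation, the schoolbook multiply, and the sizes below are my assumptions. To get a meaningful baseline you would swap in the real vvp_vector2_t class and its operators from the vvp source, and then write a GPU routine that consumes the same input data.

// wide_mul_bench.cc -- rough timing sketch for a wide two-state multiply.
// The wide_t representation here is only a stand-in for vvp_vector2_t;
// replace it with the real class to measure the actual runtime code.
#include <cstdint>
#include <cstdio>
#include <vector>
#include <chrono>

typedef std::vector<uint32_t> wide_t;   // least significant word first

// Schoolbook multiply of two wide vectors; result has a.size()+b.size() words.
static wide_t wide_mul(const wide_t& a, const wide_t& b)
{
      wide_t res(a.size() + b.size(), 0);
      for (size_t i = 0; i < a.size(); i += 1) {
	    uint64_t carry = 0;
	    for (size_t j = 0; j < b.size(); j += 1) {
		  uint64_t cur = (uint64_t)a[i] * b[j] + res[i+j] + carry;
		  res[i+j] = (uint32_t)cur;
		  carry = cur >> 32;
	    }
	    res[i + b.size()] = (uint32_t)carry;
      }
      return res;
}

int main()
{
      const size_t words = 1024 / 32;   // a 1024-bit vector, i.e. > BITS_PER_WORD
      const int iterations = 100000;
      wide_t a(words, 0xdeadbeefu), b(words, 0x12345678u);

      volatile uint32_t sink = 0;       // keep the optimizer from removing the loop
      auto t0 = std::chrono::steady_clock::now();
      for (int n = 0; n < iterations; n += 1) {
	    wide_t r = wide_mul(a, b);
	    sink ^= r[0];
      }
      auto t1 = std::chrono::steady_clock::now();

      double us = std::chrono::duration<double, std::micro>(t1 - t0).count();
      printf("%d multiplies of %zu-bit vectors: %.1f us total (%.3f us each)\n",
	     iterations, words*32, us, us/iterations);
      return 0;
}

Build this with your usual optimization flags and vary the vector width and iteration count. The per-operation time is the number a GPU kernel, including its host/device transfer overhead, would need to beat before the approach is worth pursuing.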