[Datadraw-user] Parallel processing in 42
From: Bill C. <bi...@bi...> - 2008-12-29 15:05:08
On Mon, 2008-12-29 at 11:27 +0100, Questor Fused wrote:
> 3.) multithreading: the language should support mechanisms to support
> multiple threads. two paradigms I like: the
> intel-threading-building-blocks-approach and the csp2-approach. itbb is
> about work-packages that are handled parallel and csp is about channels
> and doing the communication via channels between the threads... both are
> high-level-things and perhaps better in libraries. both paradigmas can
> be programmed via an event-system.

A silly goal I have is to compile my 42 programs into FPGA based reconfigurable computers, and actually run faster than if I were running on Intel processors. Today, this only works for signal processing examples. Generally, Intel processors win.

The reason for this is simple: people focus on the wrong problem. We don't need to be able to do 1000 multiply operations in parallel. What's needed is 100 parallel memory controllers all running at full speed. DataDraw separates object properties into separate arrays which can be assigned to different memories on different memory controllers. This can allow multiple memory controllers to run in parallel efficiently to speed up inner loops, and no parallel programming is required. In theory, this would allow Xilinx to finally squash Intel for generic computation, since big Xilinx FPGAs have the ability to access many DDR2 SRAM banks in parallel. I'd love to compile my EDA tools onto Xilinx hardware and run faster!

So, parallel processing is very much a goal of 42. Here is my new list of 42's primary goals:

- foster extreme code reuse
- run much faster than C
- compile to both software and hardware
- run faster on reconfigurable computers than WinTel boxes
- allow users to extend the language

I'm an optimistic sort :-) However, I have reasons for thinking it is doable. DataDraw shows that 42 can beat C in speed, and IMO it also shows 42 can beat C++ in code reuse.
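
As an illustration of the property-array layout mentioned above, here is a minimal sketch. It is written in Python purely for brevity and is not DataDraw's generated code; the `NetTable` class and its `length` and `fanout` properties are hypothetical names chosen for the example. The idea it shows is that each property lives in its own array and an object is just an index, so an inner loop that needs one property never touches the memory holding the others.

```python
from array import array


class NetTable:
    """Hypothetical table of nets: one array per property, objects are indices."""

    def __init__(self):
        self.length = array("d")  # wire length of every net, stored contiguously
        self.fanout = array("l")  # fanout of every net, in a separate array

    def create(self, length, fanout):
        """Append a new net and return its index, which acts as the object handle."""
        self.length.append(length)
        self.fanout.append(fanout)
        return len(self.length) - 1


def total_length(nets):
    # The inner loop reads only the `length` array. The `fanout` array is
    # never accessed, so it could sit in a different memory bank.
    return sum(nets.length)


nets = NetTable()
a = nets.create(1.5, 2)
b = nets.create(2.5, 4)
print(total_length(nets))
```

Because the two properties are independent arrays, nothing in the calling code would have to change if each array were placed behind a different memory controller, which is the sense in which no parallel programming is required.
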
You probably have to take my word for it that 42 will be compilable to hardware, but I think I'm in a good position relative to almost anyone to make it happen. Here's my shameless self promotion for why I'm the right guy to write 42:

- I've been doing hardware synthesis for about 16 years
- I've written several compilers and interpreters
- I'm the primary author of DataDraw
- I've been dreaming of writing a syntax-extensible compiler since high school, and have written tons of lex/yacc code
- I am foolishly optimistic

I read up a bit on Intel's "building blocks", and C++CSP2. 42 programs at the top level are processes, similar to Verilog/VHDL, and each process can run in parallel. The event processor will assign processes to execution threads, and will optimize the number of parallel threads, just like Intel's blocks. The C++CSP2 methodology is for each thread to have its own private data, and to communicate only through channels. Channels are similar to 42's signals, so C++CSP2 style programming will also be supported.

I have some stories about trying to make EDA algorithms multi-threaded. At company S, they sell a chip DRC checker. Their first attempt to make use of multiple processors was for Sun Micro's symmetric multiprocessor machines. After a multi-man-year effort, the DRC checker was finally multi-threaded. When they tested it, the tool ran slower! What they failed to realize was that their primary bottleneck was access to memory, and by having multiple threads fighting over it, their program slowed down rather than speeding up.

A program with a common database running on a symmetric multiprocessor is very different from multiple processors working on separate databases, and communicating when needed. The C++CSP2 mechanism could work OK for processes that have local rather than shared databases, and could allow processes to run across a network, or on multiple FPGAs on a multi-FPGA reconfigurable computer. That would work well for DRC checking.
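
The private-data-plus-channels style described above can be sketched roughly as follows. This is an illustration in Python, not C++CSP2 or 42 code: `drc_worker`, `check_partitions`, and the minimum-width rule are all made up for the example, and a `queue.Queue` stands in for a channel (it is buffered, whereas real CSP channels are usually synchronous).

```python
import queue
import threading


def drc_worker(partition, min_width, results):
    # `partition` is private to this thread: no other thread reads or writes it.
    violations = sum(1 for width in partition if width < min_width)
    # The only communication with the rest of the program is through the channel.
    results.put(violations)


def check_partitions(partitions, min_width):
    """Run one worker per partition and add up the violation counts they send back."""
    results = queue.Queue()  # stands in for a CSP channel or a 42 signal
    workers = [
        threading.Thread(target=drc_worker, args=(part, min_width, results))
        for part in partitions
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    # Each worker sent exactly one message, so read one per worker.
    return sum(results.get() for _ in workers)


# Two partitions of wire widths, each owned by exactly one worker.
print(check_partitions([[1, 5, 2], [7, 1]], min_width=3))
```

Since no worker touches another worker's partition, the same structure would carry over to workers on separate machines or separate FPGAs, with only the channel implementation changing.
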
The same mechanism might even work well for placement, if you use some sort of partitioning placer, but it won't work at all for quadratic placers. This would require an ability to have multiple databases of the same type in the same program at the same time, something which currently is not supported in DataDraw, but probably should be. I'll try to work it into the semantics of 42 when I iron out details.

Regards,
Bill