From: Chris R. <cr...@gm...> - 2018-07-07 17:42:05
Experiment:
- Faster 3D in Golly using 3D.lua
- Chris Rowett (cr...@gm...)
- July 2018
Context:
3D.lua is an excellent Lua script written by Andrew Trevorrow which allows
3D rules to be explored in Golly.
Several techniques have been used already to improve the performance of
3D.lua on the two most expensive functions:
1) Computing the next generation
2) Rendering the cells
Purpose:
The purpose of this experiment was to test porting the Lua NextGeneration
functions in 3D.lua to C++ and calling those from 3D.lua instead.
I was specifically interested in two things:
1) How easy is it to port?
2) What performance improvement (if any) might be gained?
What did I do:
- Translated the NextGeneration function for the different rule types
(Moore, Face, Corner, Edge, Hexahedral and BusyBoxes) from Lua into C++.
- Made the new implementation available to Lua via some new "ovtable"
commands ("nextgen3d", "setrule3d" and "setsize3d") in Golly.
- Modified 3D.lua to use the new "ovtable" commands.
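For readers unfamiliar with what a NextGeneration function does, here is a minimal sketch of one step for a Moore-neighborhood rule. It is not the code from wxoverlay.cpp: the names Rule3D and nextGenMoore, the flat byte-per-cell grid layout and the wrapping edges are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical rule description: survive[k] / birth[k] say whether a
// live / dead cell with k live neighbors (0..26) is alive next generation.
struct Rule3D {
    bool survive[27];
    bool birth[27];
};

// One generation on an n x n x n grid stored as a flat array (1 = alive).
// Edges are assumed to wrap around, so every cell has 26 neighbors.
std::vector<std::uint8_t> nextGenMoore(const std::vector<std::uint8_t>& grid,
                                       int n, const Rule3D& rule) {
    assert(n > 0 && grid.size() == static_cast<std::size_t>(n) * n * n);
    std::vector<std::uint8_t> next(grid.size(), 0);

    // Flat index of a 0-based (x, y, z) coordinate.
    auto at = [n](int x, int y, int z) {
        return (static_cast<std::size_t>(z) * n + y) * n + x;
    };
    // Wrap a coordinate into 0..n-1, including negative inputs.
    auto wrap = [n](int c) { return (c % n + n) % n; };

    for (int z = 0; z < n; ++z) {
        for (int y = 0; y < n; ++y) {
            for (int x = 0; x < n; ++x) {
                int count = 0;
                for (int dz = -1; dz <= 1; ++dz) {
                    for (int dy = -1; dy <= 1; ++dy) {
                        for (int dx = -1; dx <= 1; ++dx) {
                            if (dx == 0 && dy == 0 && dz == 0) continue;
                            count += grid[at(wrap(x + dx), wrap(y + dy), wrap(z + dz))];
                        }
                    }
                }
                bool alive = grid[at(x, y, z)] != 0;
                bool lives = alive ? rule.survive[count] : rule.birth[count];
                next[at(x, y, z)] = lives ? 1 : 0;
            }
        }
    }
    return next;
}
```

With survive[] set for every count and birth[] empty, a full grid stays full, which corresponds to the "cells always survive" rule 3D0..26/ used in the test below.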
The test:
1. I ran the modified 3D.lua in the modified Golly.
2. I set the grid size to 100 (View > Set Grid Size... 100).
3. I filled the grid completely (File > Random Pattern... 100f) giving a
population of 1,000,000 (100 x 100 x 100).
4. For each rule type I set a rule where cells would always survive: e.g.
3D0..26/ for Moore
5. I then ran the pattern and watched the timing.
Test platform:
Core i7 2600K quad core, HT @4.3GHz
16 GB RAM
Windows 10 64bit
Golly 3.2 (modified) 64bit
3D.lua (modified)
Results:
1) Ease of porting
The port was pretty straightforward. I added a Table class to simulate
a simple Lua table and that made the code look fairly similar.
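As an illustration of that idea, a 1-based wrapper around std::vector might look like the sketch below. This is not the Table class from the modified wxoverlay.cpp; the member names insert and size are assumptions chosen to mirror Lua's table.insert and the # operator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a Lua array-style table:
// valid indices run from 1 to size(), as in Lua.
template <typename T>
class Table {
public:
    // Append a value, like t[#t + 1] = v in Lua.
    void insert(const T& value) { items_.push_back(value); }

    // Number of elements, like #t in Lua.
    std::size_t size() const { return items_.size(); }

    // 1-based access; the shift to 0-based storage happens only here.
    T& operator[](std::size_t i) {
        assert(i >= 1 && i <= items_.size());
        return items_[i - 1];
    }
    const T& operator[](std::size_t i) const {
        assert(i >= 1 && i <= items_.size());
        return items_[i - 1];
    }

private:
    std::vector<T> items_;
};
```

Keeping the index shift inside operator[] lets translated loops keep their Lua bounds (for i = 1, #t) without scattering "- 1" through the ported code.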
The only two gotchas were:
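- C++ handles the mod operator "%" differently from Lua for negative
numbers: in C++ the result takes the sign of the dividend, in Lua the
sign of the divisor (so -1 % 100 is -1 in C++ but 99 in Lua).
- C++ arrays start at 0, Lua arrays start at 1.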
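The mod gotcha matters when coordinates wrap around the grid edges. A minimal sketch of a Lua-compatible modulo in C++ follows; the helper names floorMod and wrap are illustrative and are not taken from the modified source.

```cpp
#include <cassert>

// Lua's a % n is a floored modulo: for n > 0 the result is always in 0..n-1.
// C++'s a % n truncates toward zero, so a negative a gives a negative result.
// floorMod reproduces the Lua behavior for a positive divisor n.
inline int floorMod(int a, int n) {
    assert(n > 0);
    return (a % n + n) % n;
}

// Wrap a 0-based coordinate onto a grid of the given size.
inline int wrap(int coord, int size) {
    return floorMod(coord, size);
}
```

For example, on a grid of size 100 the neighbor to the left of x = 0 is wrap(-1, 100), which is 99, whereas the plain C++ expression -1 % 100 yields -1 and would index outside the grid.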
2) Performance
Timings are in ms and are comparing the original Lua version with the new
Lua/C++ version.
Algo         Rule         Lua (ms)   Lua/C++ (ms)   Speedup
Moore        3D0..26/          820             75       11x
Face         3D0..6/F          812             71       11x
Corner       3D0..8/C          916             78       12x
Edge         3D0..12/E        1178             84       14x
Hexahedral   3D0..12/H        1155             83       14x
BB phase 1   BB               3646            138       27x
BB phase 6   BB               4239            138       31x
The experiment gave an 11x to 14x speed improvement for the
standard neighborhood rules.
BusyBoxes was 27x to 31x faster depending on which of the 6 phases
was being calculated. I picked the slowest phase and the fastest phase
for the results.
Conclusion:
For the case examined, the result was definitely worth the effort.
An expensive, standalone function was translated.
The impact on 3D.lua was minimal and the performance improvement
significant.
Files available at:
http://lazyslug.no-ip.biz/lifeview/golly/Golly3D.zip
Golly.exe
- Windows 64bit executable with the new ovtable commands
3D.lua
- Modified 3D.lua which replaces the NextGen* functions with ovtable calls.
wxoverlay.cpp
wxoverlay.h
- The source code for the overlay/ovtable functions in case you wish to do
your own build.
- Should replace the ones in the gui-wx/ folder.
- No changes are required to the build files.
I'd be very interested in any feedback.
Is this path worth continuing?
Should this be added to Golly?
Should we write a native 3D algo in Golly?
Chris