This project is practically unmaintained.
Turns out my idea was not original, even remotely. Smalltalk-80 documented exactly such an approach as part of showing off the BitBlt primitive its authors had invented. [Blue Book pages 412–413; Byte Magazine volume 06 number 08 (1981), pages 190, 192.] In other words, this was done as soon as it became possible!
It's a simple idea: implement Conway's Game of Life using boolean logic operations (and, or, not, xor); do so by BitBlt operations, thus running a Life game in every bitplane in parallel.
Two implementations are included: a Python version and a C version (an xscreensaver hack).
2D python version screenshot:
3D python version screenshot:
Life runs on an (ideally) infinite rectangular grid. A generation is defined by the value (alive or dead, 1 or 0) of each cell of the grid. Given a generation n, the rules tell you how to produce the next generation n+1.
The state of a given cell in generation n+1 depends only on the states of this cell and its 8 neighbors in generation n. More precisely, it depends only on its own previous state and the number of neighbors that were alive:
The next state of a cell is a function of 9 bits. Doing boolean operations on 2-D arrays of bits is supported by most graphics libraries (they call it blitting) and is heavily optimized (frequently it is accelerated in hardware). The operation is done in parallel on all pixels: corresponding pixels in two images are combined according to some boolean operation. If I take an image representing a generation, and also take 8 views of it that are offset by one pixel in each direction (easily supported by all graphics platforms), I can then combine them by blits to generate the image of the next generation.
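To make the idea concrete, here is a minimal NumPy sketch (illustrative only, not the project's code): np.roll provides the 8 one-pixel-offset views, which are combined into a neighbor count. This version counts arithmetically; a blit-based version replaces the sum with the boolean counter updates discussed later in this document.

```python
import numpy as np

def life_step(grid):
    """One Game of Life generation on a wrapping (toroidal) 0/1 grid.

    The 8 one-pixel-offset views of the grid are produced with np.roll
    and combined; a blit implementation would replace the arithmetic
    sum with boolean counter updates.
    """
    neighbours = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # Standard rules: born with 3 neighbours, survive with 2 or 3.
    born = neighbours == 3
    survives = grid.astype(bool) & (neighbours == 2)
    return (born | survives).astype(grid.dtype)
```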
This is interesting because it is very fast and can even be done mostly on the graphics co-processor without loading the CPU. This is very good for a screensaver.
UPDATE: I wrote this project hoping to achieve hardware acceleration, but I'm not sure I ever did! The Python version definitely does the operations on the CPU (using NumPy; AFAIK Pygame/SDL don't expose boolean blits at all). The C version uses an old X API that might or might not be accelerated. See X Hardware acceleration?. Anyway, if you care about a GPU-accelerated Game of Life, there are better approaches and implementations.
Blits are actually typically done on 3-D arrays, where the third dimension is the depth of the image - the 8, 16, 24, etc. bits that one needs to represent each pixel's color. The high bit of a pixel in one image is combined with the high bit of the corresponding pixel in the second image, etc. Thus if I only do blits, no information is exchanged between the bitplanes. Parallel Life games will run in each bitplane. [As an optimization this is known as bit slicing.]
This gives an interesting visual effect: many Life games superimposed translucently, each with a different color/brightness (recall that bit k in a binary number contributes 2^k to the number's value, linearly and independently of the other bits). On a TrueColor display you will see 8 red lives, 8 green ones and 8 blue ones. Actually, you will have a hard time seeing all but the ~3 highest bits of each color (unless they stop changing, then the lower planes become visible), because the power-of-2 law results in very strong masking by the highest bits...
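The bitplane parallelism is easy to check in NumPy (an illustrative sketch, not the project's code): pack 8 independent boards into one uint8 image; any bitwise operation on the packed image then acts on each plane separately.

```python
import numpy as np

rng = np.random.default_rng(0)
planes = rng.integers(0, 2, size=(8, 16, 16), dtype=np.uint8)  # 8 boards

# Pack the 8 boards into the 8 bitplanes of a single uint8 image.
packed = np.zeros((16, 16), dtype=np.uint8)
for k in range(8):
    packed |= planes[k] << k

# One bitwise "blit" on the packed image...
combined = packed & np.roll(packed, 1, axis=0)

# ...equals the same operation done on every plane independently.
for k in range(8):
    plane_op = planes[k] & np.roll(planes[k], 1, axis=0)
    assert np.array_equal((combined >> k) & 1, plane_op)
```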
If you don't have a TrueColor display, it can be simulated (and even very flexibly, e.g. with other colors, slower-than-powers-of-2 value decay, etc.) by suitably changing the colormap (a.k.a. palette). So even on TrueColor displays, choosing a DirectColor visual might be better... Note: currently, the colors of the original image are distorted when applying the palette change.
The Python version requires the NumPy and Pygame libraries. It was originally written to run on any Python >=2.2, though I haven't tested old versions recently. It works on Python 3 (if you can install Pygame on it).
The pygame2d.py and pygame3d.py files should run out of the box (bitlife.py is the module they use). See the source for more options and keys (in particular, try the arrow keys with the 3D version; 'q' exits).
The C version is intended to integrate into xscreensaver (and hopefully will in the future). UPDATE: it's many years out of date; I've never even told jwz about its existence, and the code is over-engineered and not really suitable for inclusion.
You first need to get the xscreensaver source distribution. This version of bitlife compiles cleanly against 4.23.
Assuming you unpacked both under your home directory, resulting in ~/bitlife-0.9.6 and ~/xscreensaver-4.23:
    cd ~/xscreensaver-4.23/
    patch -p 1 < ~/bitlife-0.9.6/xscreensaver/patch
    cp -rv ~/bitlife-0.9.6/xscreensaver/hacks/* hacks/
    ./configure
    make -C hacks bitlife   # check our part early
    hacks/bitlife           # test it...
    make                    # compile everything else
    sudo make install       # will install all of xscreensaver!
If the Makefile.in patch doesn't apply cleanly, you can do it by hand (it's very simple); you can also make the changes directly to Makefile if you prefer.
If you want to debug (or understand the workings), add -DDEBUG to CFLAGS at the top of the Makefile. It will be particularly verbose about colormap manipulation; when running, it will denote blit operations by somewhat cryptic one- or two-letter codes - read the code (search for dprintf) to understand them.
I wrote the C version hoping it would get hardware accelerated. Its workhorse is XCopyArea() with GXxor, GXand and similar modes, which runs on the X server and could be done in hardware by a 2D-accelerated driver. Unfortunately, 2D acceleration of old X11 APIs was a big mess (it went through many architecture changes - XAA, EXA etc.) and I never figured out whether it is in fact accelerated on any system :-(
Later work on X acceleration shifted to OpenGL (not sure if it supports bitwise operations) and XRender compositing (which doesn't - what it calls "xor" is a continuous function, not bitwise xor). XCopyArea seems half forgotten; googling whether it is accelerated mostly gives results older than 2005...
Random example: [nouveau] XCopyArea with GXand approx 1000 times slower than GXcopy.
See xscreensaver/hacks/bitlife.c source for extensive discussion of technical choices (many in retrospect misguided) and future plans (which I don't expect to touch).
Anyway, if you care about a GPU-accelerated Game of Life, the sensible approach nowadays would be at least OpenGL, and better yet more direct programming of the GPU using CUDA/OpenCL.
Before we start, I must note that Hashlife is exponentially faster than anything I present here.
[C notation: & is and, | is or, ^ is xor; a&=b means take a&b and store it back into a...]
As I said, I want a way to compute the 9->1 bit function (9 = this cell + its 8 neighbors):
|sum of neighbors||0||1||2||3||4+|
|this cell = 0||0||0||0||1||0|
|this cell = 1||0||0||1||1||0|
using boolean operators. So to a first approximation I want the smallest boolean circuit. More precisely, BitBlts modify an existing bitmap, so I want a short program consisting of "2-argument opcodes" (like A &= B, not C = A & B).
I also prefer a small number of intermediate bitmaps. [I originally believed this is important if we have hardware blits and hope to fit all the arrays into video memory. With today's copious memory, I suppose it's a moot concern.]
The costly operation is counting the neighbors (done 8 times). Note that it's enough to saturate the count at 4 (all values >= 4 give the same result), so the counter needs only 5 distinct states.
The brute-force approach is to count in binary. We need 3 bits for the counter itself. Then we need at least one bit for the carry (we can get away with computing the second carry directly into bit 2; it helps to store bit 2 reversed):
    carry1 = input
    carry1 &= bit0
    bit0 ^= input
    carry1 &= bit2not  # protect bit 2 once it is "on" (0)
    bit1 ^= carry1
    carry1 &= ~bit1    # and-not is a single blit; carry1 is now the carry into bit 2
    bit2not ^= carry1
Thus we would need 7 operations.
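To convince myself the counter behaves, here is a NumPy sketch of a 3-bit saturating counter along these lines (illustrative, not the project's C code; on 0/1 arrays, `x ^ 1` plays the role of NOT, which a blit would do with a single and-not operation):

```python
import numpy as np

def count_one(bit0, bit1, bit2not, inp):
    """Feed one input bit into a binary counter saturating at 4.

    All operations are 2-argument bitwise updates, as a blit program
    would do them; bit 2 is stored inverted (bit2not == 1 means the
    count is still below 4).  All arrays are modified in place.
    """
    carry1 = inp.copy()
    carry1 &= bit0
    bit0 ^= inp
    carry1 &= bit2not      # protect bit 2 once it is "on" (0)
    bit1 ^= carry1
    carry1 &= bit1 ^ 1     # carry1 is now the carry into bit 2
    bit2not ^= carry1
```

Counting k ones gives bit0 + 2*bit1 == k while k < 4; from the 4th one onward bit2not drops to 0 and stays there (the low bits are then allowed to be garbage, since 4+ always means "dead").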
The simpler approach I implemented is a shift register of 4 bits that starts full of zeros and a 1 is shifted in every time the input is 1. It's easy to see that it has 5 states, locking into the last one. The interesting part is that the shifting process can be done without extra bits if performed starting from the MSB:
    bit[i+1] |= input
    bit[i+1] &= bit[i]
The LSB doesn't need the and, so we need 7 operations again. I chose this method since it is simpler to think about, uses only two operation types, and satisfies a nice property: bit[i+1] => bit[i].
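Putting the pieces together, here is a NumPy sketch of a full Life step built from this shift-register counter, using only copies, shifts and bitwise operations (illustrative, not the project's code; `x ^ 1` stands in for NOT on 0/1 arrays):

```python
import numpy as np

def life_step_bitwise(cells):
    """One Life generation on a wrapping 0/1 grid, mirroring the 4-bit
    'thermometer' shift-register counter described above."""
    b = [np.zeros_like(cells) for _ in range(4)]  # counter bits, all 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if (dy, dx) == (0, 0):
                continue
            inp = np.roll(np.roll(cells, dy, axis=0), dx, axis=1)
            # Shift a 1 into the register iff inp is 1, MSB first.
            for i in (2, 1, 0):
                b[i + 1] |= inp
                b[i + 1] &= b[i]
            b[0] |= inp        # the LSB needs no AND
    # Thermometer code: b[k] == 1  <=>  at least k+1 live neighbours.
    # Alive next: at least 2, fewer than 4, and (alive or at least 3).
    return b[1] & (cells | b[2]) & (b[3] ^ 1)
```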
Some additional simplifications can be applied to both methods; notably, instead of clearing the arrays each iteration, one adjusts the operations involving them: for example, instead of or-ing onto something that is supposed to be zero, just copy into it.
The search for faster ways is not over. In particular I didn't try blits from the life array onto itself (this might be able to parallelize things that are impossible under the counting approach).
Another possible direction is using a lookup table (LUT) operation as the main "engine", rather than blits with logical operations. If you combine / interleave bits of different operands into one image, you can effectively compute logic functions of several operands by a LUT. In particular, the counting could be done in one lookup! I'm not sure what platforms / libraries provide access to accelerated LUT operations, however.
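As a NumPy sketch of the LUT idea (illustrative only; a real implementation would also need the bit interleaving itself to be cheap): pack the 9 neighborhood bits into a per-cell index, then a single table lookup yields the next generation.

```python
import numpy as np

# 512-entry table: index = the 9 bits of a 3x3 neighbourhood,
# with the centre cell at bit 4.
table = np.zeros(512, dtype=np.uint8)
for idx in range(512):
    centre = (idx >> 4) & 1
    neighbours = bin(idx).count("1") - centre
    table[idx] = 1 if (neighbours == 3 or (centre and neighbours == 2)) else 0

def life_step_lut(cells):
    """One Life generation on a wrapping 0/1 grid via a single lookup:
    interleave the 9 neighbourhood bits into an index, then index the
    precomputed table."""
    idx = np.zeros(cells.shape, dtype=np.int32)
    k = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            view = np.roll(np.roll(cells, dy, axis=0), dx, axis=1)
            idx |= view.astype(np.int32) << k   # centre lands at bit 4
            k += 1
    return table[idx]
```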
The xscreensaver hack (everything under xscreensaver/) is distributed under the same X11-like license as the rest of xscreensaver:
Permission to use, copy, modify, distribute, and sell this software and its documentation for any purpose is hereby granted without fee, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation. No representations are made about the suitability of this software for any purpose. It is provided "as is" without express or implied warranty.
Files in this directory (presently - this README document and screenshots) are dual-licensed under both LGPL and the X11-like license: you can use them under either of the licenses at your choice.
License change: the Python part is now under the GNU Lesser General Public License (LGPL).
It was previously under the GNU General Public License (GPL). This change can only be good for users, as the LGPL is more relaxed. I changed it because:
I also clarified the license section to say which directories have which license and made this common document dual-licensed.