[Tcl9-cloverfield] ANNOUNCE: Colibri version 0.9
Status: Alpha
Brought to you by:
fbonnet
From: Frédéric B. <fre...@fr...> - 2011-03-10 10:59:28
|
ANNOUNCE: Colibri version 0.9 ============================= As part of my ongoing work on the Cloverfield project, I am pleased to announce the ninth version of Colibri. CHANGES SINCE VERSION 0.8 ========================= 1. Improved threading model - previous versions used a strict appartment model: each thread had its own memory pools & GC process. Consequently data could not be shared across threads, which is suboptimal when using immutable structures (especially strings and ropes). Individual threads can now choose which model to follow at initialization time. - COL_SINGLE model behaves as previous versions: strict appartment, synchronous GC, no data sharing, deterministic behavior. - COL_ASYNC model resembles COL_SINGLE but the GC process uses its own thread for asynchronous processing. This model is well suited for single-threaded, non-deterministic applications, such as I/O intensive, even-driven or interactive processes, where the GC can be performed as a backgroud task without blocking the main thread (as long as it doesn't need to access the managed data). - COL_SHARED model allows several threads to share data. There can be several distinct "groups" of threads in the same process, each having shared memory pools and a single background GC thread. For example, COL_SHARED and COL_SHARED+1 define two distinct shared groups. Immutable data can be shared safely within a group without specific handling, however mutable data needs explicit, external synchronization depending on the application's needs. Contention is minimal: memory allocation is thread-local and as fast as with single-threaded models, older structures are merged in the shared pool during the GC process. Blocking only occurs when a GC is ongoing. Benchmarks show that this model scales very well. 2. Real-world usability and transition path to Tcl9 - the feature list is now sufficient for real-world applications regarding memory data storage - this includes Tcl itself: Colibri now provides data structures and threading models that I believe can cover the needs of the Tcl core or any of its reimplementations WHAT IS CLOVERFIELD? ==================== Wiki page: http://wiki.tcl.tk/Cloverfield WHAT DOES COLIBRI STAND FOR? ============================ Colibris, known in English as hummingbirds, are a family of birds known for their small size and high wing speed. The bee hummingbird (Mellisuga helenae), found in Cuba, is the smallest of all birds, with a mass of 1.8 g and a total length of about 5cm. They are renown for their stationary and backward flight abilities on par with flying insects, which allow them to feed on the nectar of plants and flowers in-flight. I've chosen this name for this project because its goal is to be fast and lightweight, and to follow the feather and bird theme shared with Tcl and many related projects. WHAT IS COLIBRI? ================ Colibri is an abstract data type infrastructure. It features: - Extensible data type sytem dubbed "words" * similar to Tcl_Obj * minimum payload is identical to that of Tcl_Obj, i.e. 8 bytes, but can take more space if needed * words can have synonyms, i.e. words of any type * synonyms can form chains of arbitrary length, so a word can have several synonyms with different representations -- no more shimmering! * predefined types such as ints, single chars and small strings (up to 3 chars on 32-bit systems) are represented as immediate values: the value is stored directly in the pointer whenever possible rather than in an allocated structure. Everything is abstracted behind accessors. * several high level datastructures are provided, such as ropes (binary tree-based strings, see http://wiki.tcl.tk/20690), vectors (flat arrays), lists (binary trees of vectors with support for cycles and sparse representation), integer- and string-keyed maps, along with easy to use iterator interfaces * ropes are immutable strings that allow for fast operations (extraction, insertion, deletion, concatenation) with maximized data sharing * string types support the full Unicode range (up to 32-bit codepoints) in 1/2/4-byte fixed-width encodings as well as variable-width UTF-8, with transparent conversions and access * vectors and lists come in both immutable and mutable forms. The former share the same features as ropes. The latter implement in-place modifications and gracious degradation to immutable forms * custom word types are supported: lazy or procedural string representations, mutable or immutable types, etc. - Fast and efficient cell-based allocator * page-based allocation for optimal locality of reference and cache use * 16-byte cells on 32-bit systems fit most needs, but elements can allocate an arbitrary number of cells if needed * overhead is small: 2 bits per 16-byte cell * raw alloc performances are competitive with the stock system malloc, and in many cases much faster (up to 5 times as fast on small strings and on words vs. Tcl_Obj-like structs) * single cell allocation (the most frequent case) is very fast - Automatic memory management thanks to an exact (AKA accurate or precise), generational, copying, mark-and-sweep, garbage collector * exact GC implies that roots (externally referenced elements) and cell mutations be declared by the application, using a simple API * custom types can define a finalizer that is called at deletion time; one of the applications is to plug in externally managed structures (e.g. using malloc/free) * the GC process is fully controllable (pause/resume) so that the applications don't get interrupted unexpectedly * generational GC limits the overhead by restricting the collection to newer elements, which are typically short-lived * longer-living elements are collected less often as they get promoted to older generations * promotion is done by moving whole pages between memory pools, optionally performing compaction when fragmentation exceeds a certain threshold, thus limiting the overhead and improving the cache-friendliness over time * contrary to reference-counting schemes (RC), GCs support circular references without memory leaks nor specific handling; word synonym chains take advantage of this, being implemented as circular lists * studies show that GCs consistently outperform RC in raw performances and real-world cases, because the cost (both space and time) of maintaining reference counters outweights the amortized GC overhead, especially with generational GCs Colibri is written in plain C and is free of any dependency besides system libraries. The compiled binary DLL on Windows is about 62kB. The source code is heavily commented and follows the Tcl quality standards. HOW DOES COLIBRI RELATE TO TCL? =============================== From the Cloverfield announcement: " The last major point of the project is related to implementation and low level issues. The existing Tcl implementation is notoriously solid, however some changes are too radical to be performed on the current code base, so a reimplementation is certainly needed on vast portions. For example, the string representations, object structures, and virtual machines will certainly require complete rewrite to accommodate with the needed changes, which means that a lot of code won't be reused. However vast portions of library code and algorithms certainly will (clock scan, bigint, regexp...), as well as the high coding standards and QA that are the trademarks of our beloved language. " So Colibri is a candidate infrastructure for Tcl9 as an alternative to the current Tcl_Obj-based core implementation. I believe that the features provided by Colibri shall yield higher levels of performances than the current architecture, at the price of an ABI incompatibility (for which major versions are made anyway), but with a similar philosophy that should ease conversion of existing code. PLANNED FEATURES FOR NEXT VERSIONS ================================== 1. 64-bit support 2. Improvements on custom types: custom hash maps, generic map access, immutable map types 3. Proper internal and user documentation WHAT NEEDS TO BE DONE? ====================== My main development platform is Windows, so the source archive primarily includes Microsoft Visual Studio project files. Microsoft provides a free edition of their development environment known as Visual Studio Express for those willing to compile and test the library without having to buy an expensive license. Other systems need a makefile and/or autoconf scripts. I also use an Ubuntu Karmic Koala 9.10 system for Linux development, so the archive also includes minimalist GNU makefiles for building with the provided GCC compiler. However it makes no use of other parts of the GNU toolchain (autoconf and the like). The code is fairly portable on 32-bit systems. 64-bit support will need more work because all the internals are fine-tuned and optimized at the bit level; however porting to 64-bit should be rather straightforward: the algorithms will remain unchanged, structure access is abstracted behing macros, and cell size is proportional to the machine word size (a cell should be able to store 4 words, which add up to 16 bytes on 32-bit systems). The only parts that need platform-specific code are low-level page allocation and GC-related synchronization. Colibri needs system calls that allocate boundary-aligned pages, as well as synchronization primitives such as mutexes and condition variables. At present both Windows and Unix (Linux) versions are provided, the latter using mmap. Porting to other systems should require only minimal effort, as the platform-specific code is limited to a handful of functions gathered in a platform-specific source subtree. Platform-specific peculiarities should not impact the overall architecture. Last, it lacks user and design documentation, although the source code is extensively commented. GRAND UNIFICATION SCHEME ======================== A great side project would be to reimplement Tcl over Colibri as a replacement for the current Tcl_Obj based code. Of course the resulting library would be incompatible on an ABI level, but this would provide a great testbed for Colibri in real-world cases (using pure script applications) as well as a comparison point with the official Tcl core, and will possibly open the path to Tcl9. This most likely involves a lot of work. However I believe that this conversion is possible starting at version 0.9, as all the necessary building blocks are now present (data structures, threading model...). WHERE CAN IT BE FOUND? ====================== Wiki page: http://wiki.tcl.tk/Colibri Project page @ SourceForge.net: http://sourceforge.net/projects/tcl9/ Mailing list: http://sourceforge.net/mailarchive/forum.php?forum_name=tcl9-colibri Direct Download: - source: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.9/colibri0.9.src.zip/download - Windows binary: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.9/colibri0.9.win32.zip/download LICENSE ======= The license is the same as for Tcl. |