tcl9-cloverfield Mailing List for Tcl9
Status: Alpha
Brought to you by:
fbonnet
You can subscribe to this list here.
2008 |
Jan
|
Feb
(3) |
Mar
(9) |
Apr
(22) |
May
(64) |
Jun
(13) |
Jul
(1) |
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
---|---|---|---|---|---|---|---|---|---|---|---|---|
2009 |
Jan
(12) |
Feb
(1) |
Mar
|
Apr
(1) |
May
|
Jun
|
Jul
|
Aug
|
Sep
(1) |
Oct
|
Nov
|
Dec
|
2010 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
(1) |
Sep
|
Oct
(2) |
Nov
|
Dec
(1) |
2011 |
Jan
|
Feb
(1) |
Mar
(1) |
Apr
(1) |
May
|
Jun
(1) |
Jul
|
Aug
|
Sep
|
Oct
|
Nov
(5) |
Dec
(1) |
From: Frédéric B. <fre...@fr...> - 2011-12-02 16:16:25
|
ANNOUNCE: Colibri version 0.13 ============================== As part of my ongoing work on the Cloverfield project, I am pleased to announce the thirteenth version of Colibri. CHANGES SINCE VERSION 0.12 ========================== 1. Bugfix release: there was a deadlock in the Unix version where a pthread mutex was entered recursively, which is forbidden. WHAT IS CLOVERFIELD? ==================== Wiki page: http://wiki.tcl.tk/Cloverfield WHAT DOES COLIBRI STAND FOR? ============================ Colibris, known in English as hummingbirds, are a family of birds known for their small size and high wing speed. The bee hummingbird (Mellisuga helenae), found in Cuba, is the smallest of all birds, with a mass of 1.8 g and a total length of about 5cm. They are renown for their stationary and backward flight abilities on par with flying insects, which allow them to feed on the nectar of plants and flowers in-flight. I've chosen this name for this project because its goal is to be fast and lightweight, and to follow the feather and bird theme shared with Tcl and many related projects. WHAT IS COLIBRI? ================ Colibri is an abstract data type infrastructure. It features: - Extensible data type sytem dubbed "words" * similar to Tcl_Obj * minimum payload is identical to that of Tcl_Obj, i.e. 8 bytes on 32-bit systems, but can take more space if needed * words can have synonyms, i.e. words of any type * synonyms can form chains of arbitrary length, so a word can have several synonyms with different representations -- no more shimmering! * predefined types such as ints, floating points, single chars and small strings (up to 3 chars on 32-bit and 7 on 64-bit systems) are represented as immediate values: the value is stored directly in the pointer whenever possible rather than in an allocated structure. Everything is abstracted behind accessors. * several high level datastructures are provided, such as ropes (binary tree-based strings, see http://wiki.tcl.tk/20690), vectors (flat arrays), lists (binary trees of vectors with support for cycles and sparse representation), integer- and string-keyed maps, along with easy to use iterator interfaces * ropes are immutable strings that allow for fast operations (extraction, insertion, deletion, concatenation) with maximized data sharing * string types support the full Unicode range (up to 24-bit codepoints) in 1/2/4-byte fixed-width encodings as well as variable-width UTF-8, with transparent conversions and access * vectors and lists come in both immutable and mutable forms. The former share the same features as ropes. The latter implement in-place modifications and gracious degradation to immutable forms * custom word types are supported: lazy or procedural string representations, mutable or immutable types, etc. - Fast and efficient cell-based allocator * page-based allocation for optimal locality of reference and cache use * 4-word cells (16 bytes on 32-bit, 32 on 64-bit systems) fit most needs, but elements can allocatean arbitrary number of cells if needed * overhead is small: 2 bits per cell * raw alloc performances are competitive with the stock system malloc, and in many cases much faster (up to 5 times as fast on small strings and on words vs. Tcl_Obj-like structs) * single cell allocation (the most frequent case) is very fast - Automatic memory management thanks to an exact (AKA accurate or precise), generational, copying, mark-and-sweep, garbage collector * exact GC implies that roots (externally referenced elements) be declared by the application, using a simple API * mutations are automatically detected thanks to page write tracking * custom types can define a finalizer that is called at deletion time; one of the applications is to plug in externally managed structures (e.g. using malloc/free) * the GC process is fully controllable (pause/resume) so that the applications don't get interrupted unexpectedly * generational GC limits the overhead by restricting the collection to newer elements, which are typically short-lived * longer-living elements are collected less often as they get promoted to older generations * promotion is done by moving whole pages between memory pools, optionally performing compaction when fragmentation exceeds a certain threshold, thus limiting the overhead and improving the cache-friendliness over time * contrary to reference-counting schemes (RC), GCs support circular references without memory leaks nor specific handling; word synonym chains take advantage of this, being implemented as circular lists * studies show that GCs consistently outperform RC in raw performances and real-world cases, because the cost (both space and time) of maintaining reference counters outweights the amortized GC overhead, especially with generational GCs Colibri is written in plain C and is free of any dependency besides system libraries. The compiled binary DLL on Windows is about 67kB. The source code is heavily commented using the very readable Natural Docs syntax. HOW DOES COLIBRI RELATE TO TCL? =============================== From the Cloverfield announcement: " The last major point of the project is related to implementation and low level issues. The existing Tcl implementation is notoriously solid, however some changes are too radical to be performed on the current code base, so a reimplementation is certainly needed on vast portions. For example, the string representations, object structures, and virtual machines will certainly require complete rewrite to accommodate with the needed changes, which means that a lot of code won't be reused. However vast portions of library code and algorithms certainly will (clock scan, bigint, regexp...), as well as the high coding standards and QA that are the trademarks of our beloved language. " So Colibri is a candidate infrastructure for Tcl9 as an alternative to the current Tcl_Obj-based core implementation. I believe that the features provided by Colibri shall yield higher levels of performances than the current architecture, at the price of an ABI incompatibility (for which major versions are made anyway), but with a similar philosophy that should ease conversion of existing code. PLANNED FEATURES FOR NEXT VERSIONS ================================== 1. Improvements on map types: custom key types 2. Use word synonym chains instead of strings as the basic type in Picolibri PORTABILITY =========== My main development platform is Windows, so the source archive primarily includes Microsoft Visual Studio project files. Microsoft provides a free edition of their development environment known as Visual Studio Express for those willing to compile and test the library without having to buy an expensive license. Other systems need a makefile and/or autoconf scripts. I also use Ubuntu Karmic Koala 9.10 for Linux 32-bit development (Ubuntu 10.10 for 64-bit) so the archive also includes minimalist GNU makefiles for building with the provided GCC compiler. However it makes no use of other parts of the GNU toolchain (autoconf and the like). The code is fairly portable on 32-bit and 64-bit systems: all public and internal types are based on portable C-99 types such as intptr_t. The only parts that need platform-specific code are low-level page allocation, memory protection and related exception/signal handling, and GC-related synchronization. Colibri needs system calls that allocate boundary-aligned pages, as well as synchronization primitives such as mutexes and condition variables. At present both Windows and Unix (Linux) versions are provided, the latter using mmap. Porting to other systems should require only minimal effort, as the platform-specific code is limited to a handful of functions gathered in a platform-specific source subtree. Platform-specific peculiarities should not impact the overall architecture. GRAND UNIFICATION SCHEME ======================== A great side project would be to reimplement Tcl over Colibri as a replacement for the current Tcl_Obj based code. Of course the resulting library would be incompatible on an ABI level, but this would provide a great testbed for Colibri in real-world cases (using pure script applications) as well as a comparison point with the official Tcl core, and will possibly open the path to Tcl9. This most likely involves a lot of work. However I believe that this conversion is possible starting at version 0.9, as all the necessary building blocks are now present (data structures, threading model...). A quick port of the Picol miminimalist Tcl implementation has been done with success (see here http://wiki.tcl.tk/17893 ). At present it uses a string-based model and makes systematic string conversions but I plan to convert it to the Tcl_Obj-like word synonym model. I also intend to experiment integration with other C-based Tcl implementations such as Jim in the near future. WHERE CAN IT BE FOUND? ====================== Wiki page: http://wiki.tcl.tk/Colibri Project page @ SourceForge.net: http://sourceforge.net/projects/tcl9/ Mailing list: http://sourceforge.net/mailarchive/forum.php?forum_name=tcl9-colibri Direct Download: - source: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.13/colibri0.13.src.zip/download - documentation: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.13/colibri0.13.doc.zip/download - Windows binary: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.13/colibri0.13.win32.zip/download LICENSE ======= The license is the same as for Tcl. |
From: Steve B. <st...@wo...> - 2011-11-30 22:37:26
|
On 30/11/2011, at 11:34 PM, Frédéric Bonnet wrote: > ANNOUNCE: Colibri version 0.12 > ============================== > As part of my ongoing work on the Cloverfield project, I am pleased to announce > the twelveth version of Colibri. > > [snip] > A quick port of the Picol miminimalist Tcl implementation has been done with > success (see here http://wiki.tcl.tk/17893 ). At present it uses a string-based > model and makes systematic string conversions but I plan to convert it to the > Tcl_Obj-like word synonym model. I also intend to experiment integration with > other C-based Tcl implementations such as Jim in the near future. I would be quite interested to see how a port of Jim Tcl to Colibri looks in terms of size, speed and ease of interfacing with applications and extensions. Cheers, Steve -- µWeb: Embedded Web Framework - http://uweb.workware.net.au/ WorkWare Systems Pty Ltd W: www.workware.net.au P: +61 434 921 300 E: st...@wo... F: +61 7 3391 6002 |
From: Frédéric B. <fre...@fr...> - 2011-11-30 16:53:11
|
Hi Larry, Le 30/11/2011 15:59, Larry McVoy a écrit : > Strangely enough, we're playing similar games with write protection. Can > you tell me > > - which platforms have you tested this on? Linux/x86 and Windows/x86. It compiles on Linux/x86-64 but the linker spits a strange error on a missing symbol (oddly enough it compiled just fine on version 0.11), so I haven't had the opportunity to test it. I have no Win64 toolchain but given the overall portability of the code it should work OK (platform-specific code is minimal). As for Unix flavours, as long as you have mmap, pthreads and sigaction it should work fine, 32 or 64 bits. > - Is there a regression suite that tests the protection stuff? If so, > a readme on how to run it and we'll see if we can test it on our > cluster (all of the above and more) Yes, there are both a regression test suite and a benchmark suite. The latter is mostly for data structure comparison, so we can ignore it for now. The test suite is in the /Test dir, there is a series of .c for each test group plus a main test.c file. By default it runs the whole Colibri test suite testColibri(), but you can comment it out and run individual tests. Additionnally, there is a group of individual stress tests at the end of TestMain() that aren't by nature integrated in the regression test suite and can run independently. The ones we want are testChildVect() and testChildren(). The former allocates mutable elements and links them so as to buid a chain in reverse order (older vectors point to newer ones). The latter does more or less the same thing but with "word synonym chains" (in short, a circular linked list of related objects). Both hit memory heavily and make an intense use of parent tracking. To run these tests, simply build the whole stuff with the provided makefiles, run the regression test once to check that everything is OK, then modify test.c to run the tests you want. You can also tweak the constants in the tests if you want to adjust the memory usage, but defaults are fine. > - did you see any perf gain or loss? On regular usage (mutable hash and trie maps, both string- and integer-based), my benchmarks show a speedup of up to 25% on write access compared to the previous version 0.11 where mutable objects and client code had to explicitly call Col_WordSetModified() upon each mutable operation (updating the parent tracking structures). Read access is mostly unchanged. But this doesn't take into account the GC process. For that, the stress tests mentioned above are interesting because they are designed to spend most of their time in GC. Overall I've observed a performance penalty of about 10%, but: - In real world applications, GC run much less often than actual read/write access code, so this penalty is more than compensated by the speedup in write access. - This penalty is mostly due not to the page write tracking itself but to the changes in internal structures: I've made tests with manual vs. automatic write tracking using the same internals (calling the same book-keeping code respectively from client code and exception/signal handler), and found the latter to be slightly faster, to my surprise: I thought at first that exception/signal handling would induce a lot of lost CPU cycles, but since it only occurs on the first write, the cost seems amortized compared to the inline overhead of calling Col_WordSetModified() explicitly. > Cool stuff! Thanks! Feel free to steal the code, it's BSD-licensed (the stuff you want is in platform/unix/colUnixPlatform.c and plaform/win32colWin32Platform.c for platform-specific stuff, and colAlloc.c for memory management) Cheers, Fred |
From: Larry M. <lm...@bi...> - 2011-11-30 15:00:07
|
On Wed, Nov 30, 2011 at 02:34:45PM +0100, Fr?d?ric Bonnet wrote: > 1. Implemented page write protection-based parent tracking > > This is a major internal change in the generational GC. In previous > versions, parents (i.e. cells having children in younger generations) had > to be tracked explicitly by client code calls to Col_WordSetModified each > time a mutable cell was modified. This caused a significant performance > penalty and added complexity in client code. Now parents are automatically > tracked thanks to memory page write protection: pages of older generations > are write-protected, and write attempts are caught (using exceptions on > Windows and SIGSEGV signal handling on Unix). When such a page write is > detected, the write protection is removed and the page marked as modified, > and the client code resumes. All this happens transparently, so that client > code no longer have to track modifications. Upon GC, cells belonging to > modified pages are followed as potential parents so that younger reachable > cells are marked as well. Strangely enough, we're playing similar games with write protection. Can you tell me - which platforms have you tested this on? [ ] linux/x86 [ ] linux/x86_64 [ ] macos/x86 [ ] windows/x86 2000? XP? 2003server? 2008server? Win7? [ ] HPUX/ia64 [ ] AIX [ ] NET/Free/OpenBSD - Is there a regression suite that tests the protection stuff? If so, a readme on how to run it and we'll see if we can test it on our cluster (all of the above and more) - did you see any perf gain or loss? Cool stuff! |
From: Frédéric B. <fre...@fr...> - 2011-11-30 14:04:33
|
ANNOUNCE: Picolibri version 0.1 =============================== As part of my ongoing work on the Cloverfield project, I am pleased to announce the first version of Picolibri, a port of Picol 0.1.22 to the Colibri library (currently at version 0.12) WHAT IS CLOVERFIELD? ==================== Wiki page: http://wiki.tcl.tk/Cloverfield WHAT IS PICOL? ============== Picolibri started as a minimalist Tcl interpreter written by Salvatore Sanfilippo, written in 550 lines and code. It was further enhanced by Richard Suchenwirth to include modern Tcl features such as apply. Wiki page: http://wiki.tcl.tk/Picol " SS: Picol (http://antirez.com/page/picol) is a Tcl interpreter in 500 lines of C code! If you like it vote my entry on reddit here http://programming.reddit.com/info/1aft9/comments (the reddit url changed because of a reddit bug). RS Since 2007-04-01, I'm playing with Picol at home - it's been a long while that I prefer to do a breakfast fun project in C :) Basically I'm adding features to bring it closer to Tcl (7.x - everything is a string really). Linecount is at about 1700 now. Latest additions focused on 8.5 features like apply (see below), {*}, in/ni etc. DISCLAIMER: This is an educational toy model only, it is weaker and slower that real Tcl. I'm not planning to add regexp, Unicode, full expr parsing etc... But it's great for experimenting with C, I've had weeks of fun with it. " See also the provided README.txt for complete details about Salvatore & Richard work. WHAT IS COLIBRI? ================ Wiki page: http://wiki.tcl.tk/Colibri (outdated) SF.net project page: https://sourceforge.net/projects/tcl9/files/colibri/ " Colibri is an abstract data type infrastructure. It features: - Extensible data type sytem dubbed "words" - Fast and efficient cell-based allocator - Automatic memory management thanks to an exact (AKA accurate or precise), generational, copying, mark-and-sweep, garbage collector " WHAT IS PICOLIBRI? ================== Picolibri is a straight port of Picol (in its current version 0.1.22) to the Colibri library (version 0.12), replacing plain C strings with Colibri ropes, and malloc'd memory management with automatic garbage collection. The modified source is about 50 lines longer than the original (1735 vs 1701 for picol.c, and 147 vs. 127 for picol.h). Most changes involve replacing C calls such as strlen by their Colibri counterparts, mostly through macros for readability, and also providing string to Colibri structure conversions (ropes, lists, maps...). Picolibri is mostly an experiment to demonstrate and evaluate the usability of Colibri in real word applications, and especially in the context of interpreted languages. It greatly helped identify weaknesses and lacks of feature in the Colibri interface (especially regarding high level string handling). However it is quite usable as such (and no less usable than the original Picol); it provides a fairly complete Tcl interpreter in a very small footprint and with minimal code: source code is about 1900 lines of C code in two files, and the resulting Windows binary is 57 KB along with the 70 KB Colibri DLL. While not production quality, it proved a serious enough educational toy to play & experiment with. WHERE CAN IT BE FOUND? ====================== The modified Picol source is part of the Colibri source distribution here: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.12/colibri0.12.src.zip/download A pre-compiled Windows library is available here, it includes the command-line executable along with Colibri DLL and the original Picol scripts: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.12/picolibri0.1.win32.zip/download LICENSE ======= The license is the same as for Tcl. |
From: Frédéric B. <fre...@fr...> - 2011-11-30 13:34:59
|
ANNOUNCE: Colibri version 0.12 ============================== As part of my ongoing work on the Cloverfield project, I am pleased to announce the twelveth version of Colibri. CHANGES SINCE VERSION 0.11 ========================== 1. Implemented page write protection-based parent tracking This is a major internal change in the generational GC. In previous versions, parents (i.e. cells having children in younger generations) had to be tracked explicitly by client code calls to Col_WordSetModified each time a mutable cell was modified. This caused a significant performance penalty and added complexity in client code. Now parents are automatically tracked thanks to memory page write protection: pages of older generations are write-protected, and write attempts are caught (using exceptions on Windows and SIGSEGV signal handling on Unix). When such a page write is detected, the write protection is removed and the page marked as modified, and the client code resumes. All this happens transparently, so that client code no longer have to track modifications. Upon GC, cells belonging to modified pages are followed as potential parents so that younger reachable cells are marked as well. As a result, client code is much simpler to write (this is notable for all mutable structures such as maps or vectors) and less error prone (failure to track parents can result in catastrophic memory corruption), but runs faster too due to the absence of book-keeping on the client side. Some benchmarks show a speedup of 25%. 2. Added and improved high level rope functions: - Col_CompareRopes now have several versions for simpler usage (similar to strcmp/strncmp) - Col_RopeFind (similar to strchr/strrchr) - Col_RopeSearch (similar to strstr) 3. Added Col_NewChar for easier single-char strings. Such strings use immutable words and thus take up no allocated space. 4. Improved iterator & traversal interfaces. Ropes, maps and lists can now be iterated and traversed in reverse. 5. Some bug fixes 6. Now includes a port of Picol, a minimalist Tcl implementation (see Wiki page here http://wiki.tcl.tk/17893 ). Read the separate announcement for "Picolibri" WHAT IS CLOVERFIELD? ==================== Wiki page: http://wiki.tcl.tk/Cloverfield WHAT DOES COLIBRI STAND FOR? ============================ Colibris, known in English as hummingbirds, are a family of birds known for their small size and high wing speed. The bee hummingbird (Mellisuga helenae), found in Cuba, is the smallest of all birds, with a mass of 1.8 g and a total length of about 5cm. They are renown for their stationary and backward flight abilities on par with flying insects, which allow them to feed on the nectar of plants and flowers in-flight. I've chosen this name for this project because its goal is to be fast and lightweight, and to follow the feather and bird theme shared with Tcl and many related projects. WHAT IS COLIBRI? ================ Colibri is an abstract data type infrastructure. It features: - Extensible data type sytem dubbed "words" * similar to Tcl_Obj * minimum payload is identical to that of Tcl_Obj, i.e. 8 bytes on 32-bit systems, but can take more space if needed * words can have synonyms, i.e. words of any type * synonyms can form chains of arbitrary length, so a word can have several synonyms with different representations -- no more shimmering! * predefined types such as ints, floating points, single chars and small strings (up to 3 chars on 32-bit and 7 on 64-bit systems) are represented as immediate values: the value is stored directly in the pointer whenever possible rather than in an allocated structure. Everything is abstracted behind accessors. * several high level datastructures are provided, such as ropes (binary tree-based strings, see http://wiki.tcl.tk/20690), vectors (flat arrays), lists (binary trees of vectors with support for cycles and sparse representation), integer- and string-keyed maps, along with easy to use iterator interfaces * ropes are immutable strings that allow for fast operations (extraction, insertion, deletion, concatenation) with maximized data sharing * string types support the full Unicode range (up to 24-bit codepoints) in 1/2/4-byte fixed-width encodings as well as variable-width UTF-8, with transparent conversions and access * vectors and lists come in both immutable and mutable forms. The former share the same features as ropes. The latter implement in-place modifications and gracious degradation to immutable forms * custom word types are supported: lazy or procedural string representations, mutable or immutable types, etc. - Fast and efficient cell-based allocator * page-based allocation for optimal locality of reference and cache use * 4-word cells (16 bytes on 32-bit, 32 on 64-bit systems) fit most needs, but elements can allocatean arbitrary number of cells if needed * overhead is small: 2 bits per cell * raw alloc performances are competitive with the stock system malloc, and in many cases much faster (up to 5 times as fast on small strings and on words vs. Tcl_Obj-like structs) * single cell allocation (the most frequent case) is very fast - Automatic memory management thanks to an exact (AKA accurate or precise), generational, copying, mark-and-sweep, garbage collector * exact GC implies that roots (externally referenced elements) be declared by the application, using a simple API * mutations are automatically detected thanks to page write tracking * custom types can define a finalizer that is called at deletion time; one of the applications is to plug in externally managed structures (e.g. using malloc/free) * the GC process is fully controllable (pause/resume) so that the applications don't get interrupted unexpectedly * generational GC limits the overhead by restricting the collection to newer elements, which are typically short-lived * longer-living elements are collected less often as they get promoted to older generations * promotion is done by moving whole pages between memory pools, optionally performing compaction when fragmentation exceeds a certain threshold, thus limiting the overhead and improving the cache-friendliness over time * contrary to reference-counting schemes (RC), GCs support circular references without memory leaks nor specific handling; word synonym chains take advantage of this, being implemented as circular lists * studies show that GCs consistently outperform RC in raw performances and real-world cases, because the cost (both space and time) of maintaining reference counters outweights the amortized GC overhead, especially with generational GCs Colibri is written in plain C and is free of any dependency besides system libraries. The compiled binary DLL on Windows is about 67kB. The source code is heavily commented using the very readable Natural Docs syntax. HOW DOES COLIBRI RELATE TO TCL? =============================== From the Cloverfield announcement: " The last major point of the project is related to implementation and low level issues. The existing Tcl implementation is notoriously solid, however some changes are too radical to be performed on the current code base, so a reimplementation is certainly needed on vast portions. For example, the string representations, object structures, and virtual machines will certainly require complete rewrite to accommodate with the needed changes, which means that a lot of code won't be reused. However vast portions of library code and algorithms certainly will (clock scan, bigint, regexp...), as well as the high coding standards and QA that are the trademarks of our beloved language. " So Colibri is a candidate infrastructure for Tcl9 as an alternative to the current Tcl_Obj-based core implementation. I believe that the features provided by Colibri shall yield higher levels of performances than the current architecture, at the price of an ABI incompatibility (for which major versions are made anyway), but with a similar philosophy that should ease conversion of existing code. PLANNED FEATURES FOR NEXT VERSIONS ================================== 1. Improvements on map types: custom key types 2. Use word synonym chains instead of strings as the basic type in Picolibri PORTABILITY =========== My main development platform is Windows, so the source archive primarily includes Microsoft Visual Studio project files. Microsoft provides a free edition of their development environment known as Visual Studio Express for those willing to compile and test the library without having to buy an expensive license. Other systems need a makefile and/or autoconf scripts. I also use Ubuntu Karmic Koala 9.10 for Linux 32-bit development (Ubuntu 10.10 for 64-bit) so the archive also includes minimalist GNU makefiles for building with the provided GCC compiler. However it makes no use of other parts of the GNU toolchain (autoconf and the like). The code is fairly portable on 32-bit and 64-bit systems: all public and internal types are based on portable C-99 types such as intptr_t. The only parts that need platform-specific code are low-level page allocation, memory protection and related exception/signal handling, and GC-related synchronization. Colibri needs system calls that allocate boundary-aligned pages, as well as synchronization primitives such as mutexes and condition variables. At present both Windows and Unix (Linux) versions are provided, the latter using mmap. Porting to other systems should require only minimal effort, as the platform-specific code is limited to a handful of functions gathered in a platform-specific source subtree. Platform-specific peculiarities should not impact the overall architecture. GRAND UNIFICATION SCHEME ======================== A great side project would be to reimplement Tcl over Colibri as a replacement for the current Tcl_Obj based code. Of course the resulting library would be incompatible on an ABI level, but this would provide a great testbed for Colibri in real-world cases (using pure script applications) as well as a comparison point with the official Tcl core, and will possibly open the path to Tcl9. This most likely involves a lot of work. However I believe that this conversion is possible starting at version 0.9, as all the necessary building blocks are now present (data structures, threading model...). A quick port of the Picol miminimalist Tcl implementation has been done with success (see here http://wiki.tcl.tk/17893 ). At present it uses a string-based model and makes systematic string conversions but I plan to convert it to the Tcl_Obj-like word synonym model. I also intend to experiment integration with other C-based Tcl implementations such as Jim in the near future. WHERE CAN IT BE FOUND? ====================== Wiki page: http://wiki.tcl.tk/Colibri Project page @ SourceForge.net: http://sourceforge.net/projects/tcl9/ Mailing list: http://sourceforge.net/mailarchive/forum.php?forum_name=tcl9-colibri Direct Download: - source: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.12/colibri0.12.src.zip/download - documentation: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.12/colibri0.12.doc.zip/download - Windows binary: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.12/colibri0.12.win32.zip/download LICENSE ======= The license is the same as for Tcl. |
From: Frédéric B. <fre...@fr...> - 2011-06-23 21:36:59
|
ANNOUNCE: Colibri version 0.11 ============================== As part of my ongoing work on the Cloverfield project, I am pleased to announce the eleventh version of Colibri. CHANGES SINCE VERSION 0.10 ========================== 1. Some internal changes: - Reworked cyclic list handling. Circular lists are now immediate words that wrap regular lists; cyclic lists are built by concatenating a regular and a circular list. - Some immediate word binary tag changes to make room for floating points. 2. Added floating point word type, in both immediate and regular versions. 3. Added immutable internal map structures (both hash and trie maps); maps are still mutable datatypes but can now share data with copy-on-write semantics. 4. Improved trie map access: tries can now be iterated in reverse and use key prefix matching. 5. Removed COL_CHAR type, single-char words now use COL_STRING. WHAT IS CLOVERFIELD? ==================== Wiki page: http://wiki.tcl.tk/Cloverfield WHAT DOES COLIBRI STAND FOR? ============================ Colibris, known in English as hummingbirds, are a family of birds known for their small size and high wing speed. The bee hummingbird (Mellisuga helenae), found in Cuba, is the smallest of all birds, with a mass of 1.8 g and a total length of about 5cm. They are renown for their stationary and backward flight abilities on par with flying insects, which allow them to feed on the nectar of plants and flowers in-flight. I've chosen this name for this project because its goal is to be fast and lightweight, and to follow the feather and bird theme shared with Tcl and many related projects. WHAT IS COLIBRI? ================ Colibri is an abstract data type infrastructure. It features: - Extensible data type sytem dubbed "words" * similar to Tcl_Obj * minimum payload is identical to that of Tcl_Obj, i.e. 8 bytes on 32-bit systems, but can take more space if needed * words can have synonyms, i.e. words of any type * synonyms can form chains of arbitrary length, so a word can have several synonyms with different representations -- no more shimmering! * predefined types such as ints, floating points, single chars and small strings (up to 3 chars on 32-bit and 7 on 64-bit systems) are represented as immediate values: the value is stored directly in the pointer whenever possible rather than in an allocated structure. Everything is abstracted behind accessors. * several high level datastructures are provided, such as ropes (binary tree-based strings, see http://wiki.tcl.tk/20690), vectors (flat arrays), lists (binary trees of vectors with support for cycles and sparse representation), integer- and string-keyed maps, along with easy to use iterator interfaces * ropes are immutable strings that allow for fast operations (extraction, insertion, deletion, concatenation) with maximized data sharing * string types support the full Unicode range (up to 24-bit codepoints) in 1/2/4-byte fixed-width encodings as well as variable-width UTF-8, with transparent conversions and access * vectors and lists come in both immutable and mutable forms. The former share the same features as ropes. The latter implement in-place modifications and gracious degradation to immutable forms * custom word types are supported: lazy or procedural string representations, mutable or immutable types, etc. - Fast and efficient cell-based allocator * page-based allocation for optimal locality of reference and cache use * 4-word cells (16 bytes on 32-bit, 32 on 64-bit systems) fit most needs, but elements can allocatean arbitrary number of cells if needed * overhead is small: 2 bits per cell * raw alloc performances are competitive with the stock system malloc, and in many cases much faster (up to 5 times as fast on small strings and on words vs. Tcl_Obj-like structs) * single cell allocation (the most frequent case) is very fast - Automatic memory management thanks to an exact (AKA accurate or precise), generational, copying, mark-and-sweep, garbage collector * exact GC implies that roots (externally referenced elements) and cell mutations be declared by the application, using a simple API * custom types can define a finalizer that is called at deletion time; one of the applications is to plug in externally managed structures (e.g. using malloc/free) * the GC process is fully controllable (pause/resume) so that the applications don't get interrupted unexpectedly * generational GC limits the overhead by restricting the collection to newer elements, which are typically short-lived * longer-living elements are collected less often as they get promoted to older generations * promotion is done by moving whole pages between memory pools, optionally performing compaction when fragmentation exceeds a certain threshold, thus limiting the overhead and improving the cache-friendliness over time * contrary to reference-counting schemes (RC), GCs support circular references without memory leaks nor specific handling; word synonym chains take advantage of this, being implemented as circular lists * studies show that GCs consistently outperform RC in raw performances and real-world cases, because the cost (both space and time) of maintaining reference counters outweights the amortized GC overhead, especially with generational GCs Colibri is written in plain C and is free of any dependency besides system libraries. The compiled binary DLL on Windows is about 67kB. The source code is heavily commented using the very readable Natural Docs syntax. HOW DOES COLIBRI RELATE TO TCL? =============================== From the Cloverfield announcement: " The last major point of the project is related to implementation and low level issues. The existing Tcl implementation is notoriously solid, however some changes are too radical to be performed on the current code base, so a reimplementation is certainly needed on vast portions. For example, the string representations, object structures, and virtual machines will certainly require complete rewrite to accommodate with the needed changes, which means that a lot of code won't be reused. However vast portions of library code and algorithms certainly will (clock scan, bigint, regexp...), as well as the high coding standards and QA that are the trademarks of our beloved language. " So Colibri is a candidate infrastructure for Tcl9 as an alternative to the current Tcl_Obj-based core implementation. I believe that the features provided by Colibri shall yield higher levels of performances than the current architecture, at the price of an ABI incompatibility (for which major versions are made anyway), but with a similar philosophy that should ease conversion of existing code. PLANNED FEATURES FOR NEXT VERSIONS ================================== 1. Improvements on map types: custom key types PORTABILITY =========== My main development platform is Windows, so the source archive primarily includes Microsoft Visual Studio project files. Microsoft provides a free edition of their development environment known as Visual Studio Express for those willing to compile and test the library without having to buy an expensive license. Other systems need a makefile and/or autoconf scripts. I also use Ubuntu Karmic Koala 9.10 for Linux 32-bit development (Ubuntu 10.10 for 64-bit) so the archive also includes minimalist GNU makefiles for building with the provided GCC compiler. However it makes no use of other parts of the GNU toolchain (autoconf and the like). The code is fairly portable on 32-bit and 64-bit systems: all public and internal types are based on portable C-99 types such as intptr_t. The only parts that need platform-specific code are low-level page allocation and GC-related synchronization. Colibri needs system calls that allocate boundary-aligned pages, as well as synchronization primitives such as mutexes and condition variables. At present both Windows and Unix (Linux) versions are provided, the latter using mmap. Porting to other systems should require only minimal effort, as the platform-specific code is limited to a handful of functions gathered in a platform-specific source subtree. Platform-specific peculiarities should not impact the overall architecture. GRAND UNIFICATION SCHEME ======================== A great side project would be to reimplement Tcl over Colibri as a replacement for the current Tcl_Obj based code. Of course the resulting library would be incompatible on an ABI level, but this would provide a great testbed for Colibri in real-world cases (using pure script applications) as well as a comparison point with the official Tcl core, and will possibly open the path to Tcl9. This most likely involves a lot of work. However I believe that this conversion is possible starting at version 0.9, as all the necessary building blocks are now present (data structures, threading model...). I intend to experiment integration with other C-based Tcl implementations such as Jim or Picol in the near future. WHERE CAN IT BE FOUND? ====================== Wiki page: http://wiki.tcl.tk/Colibri Project page @ SourceForge.net: http://sourceforge.net/projects/tcl9/ Mailing list: http://sourceforge.net/mailarchive/forum.php?forum_name=tcl9-colibri Direct Download: - source: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.11/colibri0.11.src.zip/download - documentation: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.11/colibri0.11.doc.zip/download - Windows binary: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.11/colibri0.11.win32.zip/download LICENSE ======= The license is the same as for Tcl. |
From: Frédéric B. <fre...@fr...> - 2011-04-19 09:44:04
|
ANNOUNCE: Colibri version 0.10 ============================== As part of my ongoing work on the Cloverfield project, I am pleased to announce the tenth version of Colibri. CHANGES SINCE VERSION 0.9 ========================= 1. Added 64-bit support on Linux - port was rather straightforward thanks to previous efforts on portability (especially through the use of C-99 style typedefs such as intptr_t) - untested on Windows 64 but should work unchanged (system-specific code only uses portable types and calls) 2. Public & Internal documentation! - documentation is automatically generated from source files thanks to the fantastic software Natural Docs: http://www.naturaldocs.org/ - high quality, cross-linked HTML documentation with full indexes and tooltips for both public API and internal code WHAT IS CLOVERFIELD? ==================== Wiki page: http://wiki.tcl.tk/Cloverfield WHAT DOES COLIBRI STAND FOR? ============================ Colibris, known in English as hummingbirds, are a family of birds known for their small size and high wing speed. The bee hummingbird (Mellisuga helenae), found in Cuba, is the smallest of all birds, with a mass of 1.8 g and a total length of about 5cm. They are renown for their stationary and backward flight abilities on par with flying insects, which allow them to feed on the nectar of plants and flowers in-flight. I've chosen this name for this project because its goal is to be fast and lightweight, and to follow the feather and bird theme shared with Tcl and many related projects. WHAT IS COLIBRI? ================ Colibri is an abstract data type infrastructure. It features: - Extensible data type sytem dubbed "words" * similar to Tcl_Obj * minimum payload is identical to that of Tcl_Obj, i.e. 8 bytes on 32-bit systems, but can take more space if needed * words can have synonyms, i.e. words of any type * synonyms can form chains of arbitrary length, so a word can have several synonyms with different representations -- no more shimmering! * predefined types such as ints, single chars and small strings (up to 3 chars on 32-bit and 7 on 64-bit systems) are represented as immediate values: the value is stored directly in the pointer whenever possible rather than in an allocated structure. Everything is abstracted behind accessors. * several high level datastructures are provided, such as ropes (binary tree-based strings, see http://wiki.tcl.tk/20690), vectors (flat arrays), lists (binary trees of vectors with support for cycles and sparse representation), integer- and string-keyed maps, along with easy to use iterator interfaces * ropes are immutable strings that allow for fast operations (extraction, insertion, deletion, concatenation) with maximized data sharing * string types support the full Unicode range (up to 24-bit codepoints) in 1/2/4-byte fixed-width encodings as well as variable-width UTF-8, with transparent conversions and access * vectors and lists come in both immutable and mutable forms. The former share the same features as ropes. The latter implement in-place modifications and gracious degradation to immutable forms * custom word types are supported: lazy or procedural string representations, mutable or immutable types, etc. - Fast and efficient cell-based allocator * page-based allocation for optimal locality of reference and cache use * 4-word cells (16 bytes on 32-bit, 32 on 64-bit systems) fit most needs, but elements can allocatean arbitrary number of cells if needed * overhead is small: 2 bits per cell * raw alloc performances are competitive with the stock system malloc, and in many cases much faster (up to 5 times as fast on small strings and on words vs. Tcl_Obj-like structs) * single cell allocation (the most frequent case) is very fast - Automatic memory management thanks to an exact (AKA accurate or precise), generational, copying, mark-and-sweep, garbage collector * exact GC implies that roots (externally referenced elements) and cell mutations be declared by the application, using a simple API * custom types can define a finalizer that is called at deletion time; one of the applications is to plug in externally managed structures (e.g. using malloc/free) * the GC process is fully controllable (pause/resume) so that the applications don't get interrupted unexpectedly * generational GC limits the overhead by restricting the collection to newer elements, which are typically short-lived * longer-living elements are collected less often as they get promoted to older generations * promotion is done by moving whole pages between memory pools, optionally performing compaction when fragmentation exceeds a certain threshold, thus limiting the overhead and improving the cache-friendliness over time * contrary to reference-counting schemes (RC), GCs support circular references without memory leaks nor specific handling; word synonym chains take advantage of this, being implemented as circular lists * studies show that GCs consistently outperform RC in raw performances and real-world cases, because the cost (both space and time) of maintaining reference counters outweights the amortized GC overhead, especially with generational GCs Colibri is written in plain C and is free of any dependency besides system libraries. The compiled binary DLL on Windows is about 62kB. The source code is heavily commented using the very readable Natural Docs syntax. HOW DOES COLIBRI RELATE TO TCL? =============================== From the Cloverfield announcement: " The last major point of the project is related to implementation and low level issues. The existing Tcl implementation is notoriously solid, however some changes are too radical to be performed on the current code base, so a reimplementation is certainly needed on vast portions. For example, the string representations, object structures, and virtual machines will certainly require complete rewrite to accommodate with the needed changes, which means that a lot of code won't be reused. However vast portions of library code and algorithms certainly will (clock scan, bigint, regexp...), as well as the high coding standards and QA that are the trademarks of our beloved language. " So Colibri is a candidate infrastructure for Tcl9 as an alternative to the current Tcl_Obj-based core implementation. I believe that the features provided by Colibri shall yield higher levels of performances than the current architecture, at the price of an ABI incompatibility (for which major versions are made anyway), but with a similar philosophy that should ease conversion of existing code. PLANNED FEATURES FOR NEXT VERSIONS ================================== 1. Improvements on custom types: custom hash maps, generic map access 2. Immutable map types PORTABILITY =========== My main development platform is Windows, so the source archive primarily includes Microsoft Visual Studio project files. Microsoft provides a free edition of their development environment known as Visual Studio Express for those willing to compile and test the library without having to buy an expensive license. Other systems need a makefile and/or autoconf scripts. I also use Ubuntu Karmic Koala 9.10 for Linux 32-bit development (Ubuntu 10.10 for 64-bit) so the archive also includes minimalist GNU makefiles for building with the provided GCC compiler. However it makes no use of other parts of the GNU toolchain (autoconf and the like). The code is fairly portable on 32-bit and 64-bit systems: all public and internal types are based on portable C-99 types such as intptr_t. The only parts that need platform-specific code are low-level page allocation and GC-related synchronization. Colibri needs system calls that allocate boundary-aligned pages, as well as synchronization primitives such as mutexes and condition variables. At present both Windows and Unix (Linux) versions are provided, the latter using mmap. Porting to other systems should require only minimal effort, as the platform-specific code is limited to a handful of functions gathered in a platform-specific source subtree. Platform-specific peculiarities should not impact the overall architecture. GRAND UNIFICATION SCHEME ======================== A great side project would be to reimplement Tcl over Colibri as a replacement for the current Tcl_Obj based code. Of course the resulting library would be incompatible on an ABI level, but this would provide a great testbed for Colibri in real-world cases (using pure script applications) as well as a comparison point with the official Tcl core, and will possibly open the path to Tcl9. This most likely involves a lot of work. However I believe that this conversion is possible starting at version 0.9, as all the necessary building blocks are now present (data structures, threading model...). I intend to experiment integration with other C-based Tcl implementations such as Jim or Picol in the near future. WHERE CAN IT BE FOUND? ====================== Wiki page: http://wiki.tcl.tk/Colibri Project page @ SourceForge.net: http://sourceforge.net/projects/tcl9/ Mailing list: http://sourceforge.net/mailarchive/forum.php?forum_name=tcl9-colibri Direct Download: - source: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.10/colibri0.10.src.zip/download - documentation: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.10/colibri0.10.doc.zip/download - Windows binary: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.10/colibri0.10.win32.zip/download LICENSE ======= The license is the same as for Tcl. |
From: Frédéric B. <fre...@fr...> - 2011-03-10 10:59:28
|
ANNOUNCE: Colibri version 0.9 ============================= As part of my ongoing work on the Cloverfield project, I am pleased to announce the ninth version of Colibri. CHANGES SINCE VERSION 0.8 ========================= 1. Improved threading model - previous versions used a strict appartment model: each thread had its own memory pools & GC process. Consequently data could not be shared across threads, which is suboptimal when using immutable structures (especially strings and ropes). Individual threads can now choose which model to follow at initialization time. - COL_SINGLE model behaves as previous versions: strict appartment, synchronous GC, no data sharing, deterministic behavior. - COL_ASYNC model resembles COL_SINGLE but the GC process uses its own thread for asynchronous processing. This model is well suited for single-threaded, non-deterministic applications, such as I/O intensive, even-driven or interactive processes, where the GC can be performed as a backgroud task without blocking the main thread (as long as it doesn't need to access the managed data). - COL_SHARED model allows several threads to share data. There can be several distinct "groups" of threads in the same process, each having shared memory pools and a single background GC thread. For example, COL_SHARED and COL_SHARED+1 define two distinct shared groups. Immutable data can be shared safely within a group without specific handling, however mutable data needs explicit, external synchronization depending on the application's needs. Contention is minimal: memory allocation is thread-local and as fast as with single-threaded models, older structures are merged in the shared pool during the GC process. Blocking only occurs when a GC is ongoing. Benchmarks show that this model scales very well. 2. Real-world usability and transition path to Tcl9 - the feature list is now sufficient for real-world applications regarding memory data storage - this includes Tcl itself: Colibri now provides data structures and threading models that I believe can cover the needs of the Tcl core or any of its reimplementations WHAT IS CLOVERFIELD? ==================== Wiki page: http://wiki.tcl.tk/Cloverfield WHAT DOES COLIBRI STAND FOR? ============================ Colibris, known in English as hummingbirds, are a family of birds known for their small size and high wing speed. The bee hummingbird (Mellisuga helenae), found in Cuba, is the smallest of all birds, with a mass of 1.8 g and a total length of about 5cm. They are renown for their stationary and backward flight abilities on par with flying insects, which allow them to feed on the nectar of plants and flowers in-flight. I've chosen this name for this project because its goal is to be fast and lightweight, and to follow the feather and bird theme shared with Tcl and many related projects. WHAT IS COLIBRI? ================ Colibri is an abstract data type infrastructure. It features: - Extensible data type sytem dubbed "words" * similar to Tcl_Obj * minimum payload is identical to that of Tcl_Obj, i.e. 8 bytes, but can take more space if needed * words can have synonyms, i.e. words of any type * synonyms can form chains of arbitrary length, so a word can have several synonyms with different representations -- no more shimmering! * predefined types such as ints, single chars and small strings (up to 3 chars on 32-bit systems) are represented as immediate values: the value is stored directly in the pointer whenever possible rather than in an allocated structure. Everything is abstracted behind accessors. * several high level datastructures are provided, such as ropes (binary tree-based strings, see http://wiki.tcl.tk/20690), vectors (flat arrays), lists (binary trees of vectors with support for cycles and sparse representation), integer- and string-keyed maps, along with easy to use iterator interfaces * ropes are immutable strings that allow for fast operations (extraction, insertion, deletion, concatenation) with maximized data sharing * string types support the full Unicode range (up to 32-bit codepoints) in 1/2/4-byte fixed-width encodings as well as variable-width UTF-8, with transparent conversions and access * vectors and lists come in both immutable and mutable forms. The former share the same features as ropes. The latter implement in-place modifications and gracious degradation to immutable forms * custom word types are supported: lazy or procedural string representations, mutable or immutable types, etc. - Fast and efficient cell-based allocator * page-based allocation for optimal locality of reference and cache use * 16-byte cells on 32-bit systems fit most needs, but elements can allocate an arbitrary number of cells if needed * overhead is small: 2 bits per 16-byte cell * raw alloc performances are competitive with the stock system malloc, and in many cases much faster (up to 5 times as fast on small strings and on words vs. Tcl_Obj-like structs) * single cell allocation (the most frequent case) is very fast - Automatic memory management thanks to an exact (AKA accurate or precise), generational, copying, mark-and-sweep, garbage collector * exact GC implies that roots (externally referenced elements) and cell mutations be declared by the application, using a simple API * custom types can define a finalizer that is called at deletion time; one of the applications is to plug in externally managed structures (e.g. using malloc/free) * the GC process is fully controllable (pause/resume) so that the applications don't get interrupted unexpectedly * generational GC limits the overhead by restricting the collection to newer elements, which are typically short-lived * longer-living elements are collected less often as they get promoted to older generations * promotion is done by moving whole pages between memory pools, optionally performing compaction when fragmentation exceeds a certain threshold, thus limiting the overhead and improving the cache-friendliness over time * contrary to reference-counting schemes (RC), GCs support circular references without memory leaks nor specific handling; word synonym chains take advantage of this, being implemented as circular lists * studies show that GCs consistently outperform RC in raw performances and real-world cases, because the cost (both space and time) of maintaining reference counters outweights the amortized GC overhead, especially with generational GCs Colibri is written in plain C and is free of any dependency besides system libraries. The compiled binary DLL on Windows is about 62kB. The source code is heavily commented and follows the Tcl quality standards. HOW DOES COLIBRI RELATE TO TCL? =============================== From the Cloverfield announcement: " The last major point of the project is related to implementation and low level issues. The existing Tcl implementation is notoriously solid, however some changes are too radical to be performed on the current code base, so a reimplementation is certainly needed on vast portions. For example, the string representations, object structures, and virtual machines will certainly require complete rewrite to accommodate with the needed changes, which means that a lot of code won't be reused. However vast portions of library code and algorithms certainly will (clock scan, bigint, regexp...), as well as the high coding standards and QA that are the trademarks of our beloved language. " So Colibri is a candidate infrastructure for Tcl9 as an alternative to the current Tcl_Obj-based core implementation. I believe that the features provided by Colibri shall yield higher levels of performances than the current architecture, at the price of an ABI incompatibility (for which major versions are made anyway), but with a similar philosophy that should ease conversion of existing code. PLANNED FEATURES FOR NEXT VERSIONS ================================== 1. 64-bit support 2. Improvements on custom types: custom hash maps, generic map access, immutable map types 3. Proper internal and user documentation WHAT NEEDS TO BE DONE? ====================== My main development platform is Windows, so the source archive primarily includes Microsoft Visual Studio project files. Microsoft provides a free edition of their development environment known as Visual Studio Express for those willing to compile and test the library without having to buy an expensive license. Other systems need a makefile and/or autoconf scripts. I also use an Ubuntu Karmic Koala 9.10 system for Linux development, so the archive also includes minimalist GNU makefiles for building with the provided GCC compiler. However it makes no use of other parts of the GNU toolchain (autoconf and the like). The code is fairly portable on 32-bit systems. 64-bit support will need more work because all the internals are fine-tuned and optimized at the bit level; however porting to 64-bit should be rather straightforward: the algorithms will remain unchanged, structure access is abstracted behing macros, and cell size is proportional to the machine word size (a cell should be able to store 4 words, which add up to 16 bytes on 32-bit systems). The only parts that need platform-specific code are low-level page allocation and GC-related synchronization. Colibri needs system calls that allocate boundary-aligned pages, as well as synchronization primitives such as mutexes and condition variables. At present both Windows and Unix (Linux) versions are provided, the latter using mmap. Porting to other systems should require only minimal effort, as the platform-specific code is limited to a handful of functions gathered in a platform-specific source subtree. Platform-specific peculiarities should not impact the overall architecture. Last, it lacks user and design documentation, although the source code is extensively commented. GRAND UNIFICATION SCHEME ======================== A great side project would be to reimplement Tcl over Colibri as a replacement for the current Tcl_Obj based code. Of course the resulting library would be incompatible on an ABI level, but this would provide a great testbed for Colibri in real-world cases (using pure script applications) as well as a comparison point with the official Tcl core, and will possibly open the path to Tcl9. This most likely involves a lot of work. However I believe that this conversion is possible starting at version 0.9, as all the necessary building blocks are now present (data structures, threading model...). WHERE CAN IT BE FOUND? ====================== Wiki page: http://wiki.tcl.tk/Colibri Project page @ SourceForge.net: http://sourceforge.net/projects/tcl9/ Mailing list: http://sourceforge.net/mailarchive/forum.php?forum_name=tcl9-colibri Direct Download: - source: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.9/colibri0.9.src.zip/download - Windows binary: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.9/colibri0.9.win32.zip/download LICENSE ======= The license is the same as for Tcl. |
From: Frédéric B. <fre...@fr...> - 2011-02-21 08:38:55
|
ANNOUNCE: Colibri version 0.8 ============================= As part of my ongoing work on the Cloverfield project, I am pleased to announce the eighth version of Colibri. CHANGES SINCE VERSION 0.7 ========================= 1. Major rework of internals - merged ropes and words: ropes are now regular words - as a consequence, raw C string are no longer valid ropes (however conversion procs are provided) - removed old string word types, ropes can now take advantage of words' "immediate" representation (e.g. a 3-byte string takes up no allocated space and is stored within the pointer value) - custom rope types are now custom words - completely reworked parent handling (i.e. cells of older generations having children in younger generations). The previous implementation used one cell per parent, now parents are handled on a per-page basis. This paves the way for transparent support of parent detection through page guards 2. Rework of word synonym handling - immutable word types no longer have a synonym field internally, but use a special wrapper word when needed. The goal is to distinguish immutable vs. mutable data for better memory management (and possibly memory protection and inter-thread sharing in the future) 3. Rope and list traversal improvement - an arbitrary number of ropes or lists can be traversed in parallel, à la Tcl's [foreach]. This is useful for e.g. string comparisons 4. String-keyed map types - generic interface - two implementations: * hash map: similar to Tcl hash tables, unordered collection with O(1) access, but O(k) hash computation (k being the key length) * crit-bit trie: ordered collection, binary tree with O(log(n)) access, but better mean access than hash maps with long key strings 5. Map iterator interface - generic interface for map types with the same key type (integer or string) WHAT IS CLOVERFIELD? ==================== Wiki page: http://wiki.tcl.tk/Cloverfield WHAT DOES COLIBRI STAND FOR? ============================ Colibris, known in English as hummingbirds, are a family of birds known for their small size and high wing speed. The bee hummingbird (Mellisuga helenae), found in Cuba, is the smallest of all birds, with a mass of 1.8 g and a total length of about 5cm. They are renown for their stationary and backward flight abilities on par with flying insects, which allow them to feed on the nectar of plants and flowers in-flight. I've chosen this name for this project because its goal is to be fast and lightweight, and to follow the feather and bird theme shared with Tcl and many related projects. WHAT IS COLIBRI? ================ Colibri is an abstract data type infrastructure. It features: - Extensible data type sytem dubbed "words" * similar to Tcl_Obj * minimum payload is identical to that of Tcl_Obj, i.e. 8 bytes, but can take more space if needed * words can have synonyms, i.e. words of any type * synonyms can form chains of arbitrary length, so a word can have several synonyms with different representations -- no more shimmering! * predefined types such as ints, single chars and small strings (up to 3 chars on 32-bit systems) are represented as immediate values: the value is stored directly in the pointer whenever possible rather than in an allocated structure. Everything is abstracted behind accessors. * several high level datastructures are provided, such as ropes (binary tree-based strings, see http://wiki.tcl.tk/20690), vectors (flat arrays), lists (binary trees of vectors with support for cycles and sparse representation), integer- and string-keyed maps, along with easy to use iterator interfaces * ropes are immutable strings that allow for fast operations (extraction, insertion, deletion, concatenation) with maximized data sharing * string types support the full Unicode range (up to 32-bit codepoints) in 1/2/4-byte fixed-width encodings as well as variable-width UTF-8, with transparent conversions and access * vectors and lists come in both immutable and mutable forms. The former share the same features as ropes. The latter implement in-place modifications and gracious degradation to immutable forms * custom word types are supported: lazy or procedural string representations, mutable or immutable types, etc. - Fast and efficient cell-based allocator * page-based allocation for optimal locality of reference and cache use * 16-byte cells on 32-bit systems fit most needs, but elements can allocate an arbitrary number of cells if needed * overhead is small: 2 bits per 16-byte cell * raw alloc performances are competitive with the stock system malloc, and in many cases much faster (up to 5 times as fast on small strings and on words vs. Tcl_Obj-like structs) * single cell allocation (the most frequent case) is very fast - Automatic memory management thanks to an exact (AKA accurate or precise), generational, copying, mark-and-sweep, garbage collector * exact GC implies that roots (externally referenced elements) and cell mutations be declared by the application, using a simple API * custom types can define a finalizer that is called at deletion time; one of the applications is to plug in externally managed structures (e.g. using malloc/free) * the GC process is fully controllable (pause/resume) so that the applications don't get interrupted unexpectedly * generational GC limits the overhead by restricting the collection to newer elements, which are typically short-lived * longer-living elements are collected less often as they get promoted to older generations * promotion is done by moving whole pages between memory pools, optionally performing compaction when fragmentation exceeds a certain threshold, thus limiting the overhead and improving the cache-friendliness over time * contrary to reference-counting schemes (RC), GCs support circular references without memory leaks nor specific handling; word synonym chains take advantage of this, being implemented as circular lists * studies show that GCs consistently outperform RC in raw performances and real-world cases, because the cost (both space and time) of maintaining reference counters outweights the amortized GC overhead, especially with generational GCs Colibri is written in plain C and is free of any dependency besides system libraries. The compiled binary DLL on Windows is about 61kB. The source code is heavily commented and follows the Tcl quality standards. HOW DOES COLIBRI RELATE TO TCL? =============================== From the Cloverfield announcement: " The last major point of the project is related to implementation and low level issues. The existing Tcl implementation is notoriously solid, however some changes are too radical to be performed on the current code base, so a reimplementation is certainly needed on vast portions. For example, the string representations, object structures, and virtual machines will certainly require complete rewrite to accommodate with the needed changes, which means that a lot of code won't be reused. However vast portions of library code and algorithms certainly will (clock scan, bigint, regexp...), as well as the high coding standards and QA that are the trademarks of our beloved language. " So Colibri is a candidate infrastructure for Tcl9 as an alternative to the current Tcl_Obj-based core implementation. I believe that the features provided by Colibri shall yield higher levels of performances than the current architecture, at the price of an ABI incompatibility (for which major versions are made anyway), but with a similar philosophy that should ease conversion of existing code. PLANNED FEATURES FOR NEXT VERSIONS ================================== 1. Improvements on custom types (e.g. custom hash maps) 2. Improvements on the threading model 3. Proper internal and user documentation WHAT NEEDS TO BE DONE? ====================== My main development platform is Windows, so the source archive primarily includes Microsoft Visual Studio project files. Microsoft provides a free edition of their development environment known as Visual Studio Express for those willing to compile and test the library without having to buy an expensive license. Other systems need a makefile and/or autoconf scripts. I also use an Ubuntu Karmic Koala 9.10 system for Linux development, so the archive also includes minimalist GNU makefiles for building with the provided GCC compiler. However it makes no use of other parts of the GNU toolchain (autoconf and the like). The code is fairly portable on 32-bit systems. 64-bit support will need more work because all the internals are fine-tuned and optimized at the bit level; however porting to 64-bit should be rather straightforward: the algorithms will remain unchanged, structure access is abstracted behing macros, and cell size is proportional to the machine word size (a cell should be able to store 4 words, which add up to 16 bytes on 32-bit systems). The only part that needs platform-specific code is the low-level page allocator. Colibri needs system calls that allocate boundary-aligned pages. At present both Windows and Unix (Linux) versions are provided, the latter using mmap. Porting to other systems should require only minimal effort, as the platform-specific code is limited to a handful of functions gathered in a platform-specific source subtree. Platform-specific peculiarities should not impact the overall architecture. Last, it lacks user and design documentation, although the source code is extensively commented. GRAND UNIFICATION SCHEME ======================== A great side project would be to reimplement Tcl over Colibri as a replacement for the current Tcl_Obj based code. Of course the resulting library would be incompatible on an ABI level, but this would provide a great testbed for Colibri in real-world cases (using pure script applications) as well as a comparison point with the official Tcl core, and will possibly open the path to Tcl9. This most likely involves a lot of work. WHERE CAN IT BE FOUND? ====================== Wiki page: http://wiki.tcl.tk/Colibri Project page @ SourceForge.net: http://sourceforge.net/projects/tcl9/ Mailing list: http://sourceforge.net/mailarchive/forum.php?forum_name=tcl9-colibri Direct Download: - source: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.8/colibri0.8.src.zip/download - Windows binary: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.8/colibri0.8.win32.zip/download LICENSE ======= The license is the same as for Tcl. |
From: Frédéric B. <fre...@fr...> - 2010-12-20 09:31:44
|
ANNOUNCE: Colibri version 0.7 ============================= As part of my ongoing work on the Cloverfield project, I am pleased to announce the seventh version of Colibri. CHANGES SINCE VERSION 0.6 ========================= 1. Major rework of internals - system page allocator now supports arbitrarily sized pages - data structures are no longer limited in size (previous version were constrained by the page size, 1KB on 32-bit systems) - cells can be pinned, i.e. marked as unmovable when compacting GC is performed - roots no longer wrap the preserved cells but instead are stored in a dedicated trie indexed by the cell address; cells with positive refcount are pinned so that their addres doesn't change - parents (i.e. cells from older generations pointing to newer ones) are now managed on a per page basis; this improves the locality of reference during GC, and simplifies declaration: instead of calling Col_DeclareChild with parent and child cells, client code simply calls Col_SetModified on the parent - a lot of performance optimizations 2. Benchmarking framework - based on Tcl: specific tests are exposed as Tcl commands, and stress tests are written as Tcl scripts - used to compare Colibri datatype performances to Tcl, STL and others 3. Integer-keyed map types. - generic interface - two implementations: * hash map: similar to Tcl hash tables, unordered collection with O(1) access * crit-bit trie: ordered collection, binary tree with O(log(n)) access WHAT IS CLOVERFIELD? ==================== Wiki page: http://wiki.tcl.tk/Cloverfield WHAT DOES COLIBRI STAND FOR? ============================ Colibris, known in English as hummingbirds, are a family of birds known for their small size and high wing speed. The bee hummingbird (Mellisuga helenae), found in Cuba, is the smallest of all birds, with a mass of 1.8 g and a total length of about 5cm. They are renown for their stationary and backward flight abilities on par with flying insects, which allow them to feed on the nectar of plants and flowers in-flight. I've chosen this name for this project because its goal is to be fast and lightweight, and to follow the feather and bird theme shared with Tcl and many related projects. WHAT IS COLIBRI? ================ Colibri is a string and data type infrastructure. It features: - Rope-based strings (see http://wiki.tcl.tk/20690) : * ropes are immutable ... but ... * extraction, insertion, deletion, concatenation... are fast * limited compatibility with native C strings allocated in static space (or guaranteed to survive the whole application lifetime) * support for the full Unicode range (up to 32-bit codepoints) * 1/2/4-byte fixed-width encodings as well as variable-width UTF-8 with transparent conversions and access * custom rope types allows for lazy or procedural representations * iterator interface for easy access - Extensible data type sytem dubbed "words" * similar to Tcl_Obj * minimum payload is identical to that of Tcl_Obj, i.e. 8 bytes, but can take more space if needed * words can have synonyms, i.e. words of any type * synonyms can form chains of arbitrary length, so a word can have several synonyms with different representations -- no more shimmering! * predefined types such as ints, single chars and small strings (up to 3 chars on 32-bit systems) are represented as immediate values: the value is stored directly in the pointer whenever possible rather than in an allocated structure. Everything is abstracted behind accessors. * several high level datastructures are provided, such as vectors (flat arrays), lists (binary trees of vectors with support for cycles and sparse representation), integer-keyed maps, along with an easy to use iterator interface - Fast and efficient cell-based allocator * page-based allocation for optimal locality of reference and cache use * 16-byte cells on 32-bit systems fit most needs, but elements can allocate an arbitrary number of cells if needed * overhead is small: 2 bits per 16-byte cell * raw alloc performances are competitive with the stock system malloc, and in many cases much faster (up to 5 times as fast on small strings and on words vs. Tcl_Obj-like structs) * single cell allocation (the most frequent case) is very fast - Automatic memory management thanks to an exact (AKA accurate or precise), generational, copying, mark-and-sweep, garbage collector * exact GC implies that roots (externally referenced elements) and changes to parent cells must be declared by the application, using a simple API * custom types can define a finalizer; one of the applications is to plug in externally managed structures (e.g. malloc/free) * the GC process is fully controllable (pause/resume) so that the applications don't get interrupted unexpectedly * generational GC limits the overhead by restricting the collection to newer elements, which are typically short-lived * longer-living elements are collected less often as they get promoted to older generations * promotion is done by moving whole pages between memory pools, optionally performing compaction when fragmentation exceeds a certain threshold, thus limiting the overhead and improving the cache-friendliness over time * contrary to reference-counting schemes (RC), GCs allow circular references without memory leaks; word synonym chains take advantage of this, being implemented as circular lists * studies show that GCs consistently outperform RC in raw performances and real-world cases, because the cost (both space and time) of maintaining reference counters outweights the amortized GC overhead, especially with generational GCs Colibri is written in plain C and is free of any dependency besides system libraries. The compiled binary DLL on Windows is about 50kB. The source code is heavily commented and follows the Tcl quality standards. HOW DOES COLIBRI RELATE TO TCL? =============================== From the Cloverfield announcement: " The last major point of the project is related to implementation and low level issues. The existing Tcl implementation is notoriously solid, however some changes are too radical to be performed on the current code base, so a reimplementation is certainly needed on vast portions. For example, the string representations, object structures, and virtual machines will certainly require complete rewrite to accommodate with the needed changes, which means that a lot of code won't be reused. However vast portions of library code and algorithms certainly will (clock scan, bigint, regexp...), as well as the high coding standards and QA that are the trademarks of our beloved language. " So Colibri is a candidate infrastructure for Tcl9 as an alternative to the current Tcl_Obj-based core implementation. I believe that the features provided by Colibri shall yield higher levels of performances than the current architecture, at the price of an ABI incompatibility (for which major versions are made anyway), but with a similar philosophy that should ease conversion of existing code. PLANNED FEATURES FOR NEXT VERSIONS ================================== 1. String-keyed maps, and improvement to the generic and integer-keyed map interfaces (e.g. iterators). 2. Proper internal and user documentation WHAT NEEDS TO BE DONE? ====================== My main development platform is Windows, so the source archive primarily includes Microsoft Visual Studio project files. Microsoft provides a free edition of their development environment known as Visual Studio Express for those willing to compile and test the library without having to buy an expensive license. Other systems need a makefile and/or autoconf scripts. I also use an Ubuntu Karmic Koala 9.10 system for Linux development, so the archive also includes minimalist GNU makefiles for building with the provided GCC compiler. However it makes no use of other parts of the GNU toolchain (autoconf and the like). The code is fairly portable on 32-bit systems. 64-bit support will need more work because all the internals are fine-tuned and optimized at the bit level; however porting to 64-bit should be rather straightforward: the algorithms will remain unchanged, structure access is abstracted behing macros, and cell size is proportional to the machine word size (a cell should be able to store 4 words, which add up to 16 bytes on 32-bit systems). The only part that needs platform-specific code is the low-level page allocator. Colibri needs system calls that allocate boundary-aligned pages. At present both Windows and Unix (Linux) versions are provided, the latter using mmap. Porting to other systems should require only minimal effort, as the platform-specific code is limited to a handful of functions gathered in a platform-specific source subtree. Platform-specific peculiarities should not impact the overall architecture. Last, it lacks user and design documentation, although the source code is extensively commented. GRAND UNIFICATION SCHEME ======================== A great side project would be to reimplement Tcl over Colibri as a replacement for the current Tcl_Obj based code. Of course the resulting library would be incompatible on an ABI level, but this would provide a great testbed for Colibri in real-world cases (using pure script applications) as well as a comparison point with the official Tcl core, and will possibly open the path to Tcl9. This most likely involves a lot of work. WHERE CAN IT BE FOUND? ====================== Wiki page: http://wiki.tcl.tk/Colibri Project page @ SourceForge.net: http://sourceforge.net/projects/tcl9/ Mailing list: http://sourceforge.net/mailarchive/forum.php?forum_name=tcl9-colibri Direct Download: - source: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.7/colibri0.7.src.zip/download - Windows binary: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.7/colibri0.7.win32.zip/download - Linux binary (x86): http://sourceforge.net/projects/tcl9/files/colibri/colibri0.7/colibri0.7.linux-x86.zip/download LICENSE ======= The license is the same as for Tcl. |
From: Frédéric B. <fre...@fr...> - 2010-10-25 09:50:16
|
ANNOUNCE: Colibri version 0.6 ============================= As part of my ongoing work on the Cloverfield project, I am pleased to announce the sixth version of Colibri. CHANGES SINCE VERSION 0.5 ========================= 1. Added mutable list type. - efficient sparse representation of very large lists thanks to "void lists". - copy on write model, can be mixed with immutable types while preserving respective semantics. - can be "frozen" and downgraded to immutable lists at no cost. 2. Test suite! - follows the xUnit model. - implemented as a set of macros in a small, single C header file. 3. Lots of bug fixes thanks to the test suite. WHAT IS CLOVERFIELD? ==================== Wiki page: http://wiki.tcl.tk/Cloverfield WHAT DOES COLIBRI STAND FOR? ============================ Colibris, known in English as hummingbirds, are a family of birds known for their small size and high wing speed. The bee hummingbird (Mellisuga helenae), found in Cuba, is the smallest of all birds, with a mass of 1.8 g and a total length of about 5cm. They are renown for their stationary and backward flight abilities on par with flying insects, which allow them to feed on the nectar of plants and flowers in-flight. I've chosen this name for this project because its goal is to be fast and lightweight, and to follow the feather and bird theme shared with Tcl and many related projects. WHAT IS COLIBRI? ================ Colibri is a string and data type infrastructure. It features: - Rope-based strings (see http://wiki.tcl.tk/20690) : * ropes are immutable ... but ... * extraction, insertion, deletion, concatenation... are fast * limited compatibility with native C strings allocated in static space (or guaranteed to survive the whole application lifetime) * support for the full Unicode range (up to 32-bit codepoints) * 1/2/4-byte fixed-width encodings as well as variable-width UTF-8 with transparent conversions and access * custom rope types allows for lazy or procedural representations * iterator interface for easy access - Extensible data type sytem dubbed "words" * similar to Tcl_Obj * minimum payload is identical to that of Tcl_Obj, i.e. 8 bytes, but can take more space if needed * words can have synonyms, i.e. words of any type * synonyms can form chains of arbitrary length, so a word can have several synonyms with different representations -- no more shimmering! * predefined types such as ints, single chars and small strings (up to 3 chars on 32-bit systems) are represented as immediate values: the value is stored directly in the pointer whenever possible rather than in an allocated structure. Everything is abstracted behind accessors. * several high level datastructures are provided, such as vectors (flat arrays) and lists (binary trees of vectors with support for cycles and sparse representation), along with an easy to use iterator interface - Fast and efficient cell-based allocator * page-based allocation for optimal locality of reference and cache use * 16-byte cells on 32-bit systems fit most needs, but elements can allocate more cells if needed (up to 63 cells on 32-bit systems) * overhead is small: 2 bits per 16-byte cell. * raw alloc performances are competitive with the stock system malloc, and in many cases much faster (up to 5 times as fast on small strings and on words vs. Tcl_Obj-like structs) * single cell allocation (the most frequent case) is very fast - Automatic memory management thanks to an exact (AKA accurate or precise), generational, copying, mark-and-sweep, garbage collector * exact GC implies that roots (externally referenced elements) and parent-child relations must be declared by the application, using a simple API * custom types can define a finalizer; one of the applications is to plug in externally managed structures (e.g. malloc/free) * the GC process is fully controllable (pause/resume) so that the application don't get interrupted unexpectedly * generational GC limits the overhead by restricting the collection to newer elements, which are typically short-lived * longer-living elements are collected less often as they get promoted to older generations * promotion is done by moving whole pages between memory pools, optionally performing compaction when fragmentation exceeds a certain threshold, thus limiting the overhead and improving the cache-friendliness over time * contrary to reference-counting schemes (RC), GCs allow circular references without memory leaks; word synonym chains take advantage of this, being implemented as circular lists * studies show that GCs consistently outperform RC in raw performances and real-world cases, because the cost (both space and time) of maintaining reference counters outweights the amortized GC overhead, especially with generational GCs Colibri is written in plain C and is free of any dependency besides system libraries. The compiled binary DLL on Windows is about 50kB. The source code is heavily commented and follows the Tcl quality standards. HOW DOES COLIBRI RELATE TO TCL? =============================== From the Cloverfield announcement: " The last major point of the project is related to implementation and low level issues. The existing Tcl implementation is notoriously solid, however some changes are too radical to be performed on the current code base, so a reimplementation is certainly needed on vast portions. For example, the string representations, object structures, and virtual machines will certainly require complete rewrite to accommodate with the needed changes, which means that a lot of code won't be reused. However vast portions of library code and algorithms certainly will (clock scan, bigint, regexp...), as well as the high coding standards and QA that are the trademarks of our beloved language. " So Colibri is a candidate infrastructure for Tcl9 as an alternative to the current Tcl_Obj-based core implementation. I believe that the features provided by Colibri shall yield higher levels of performances than the current architecture, at the price of an ABI incompatibility (for which major versions are made anyway), but with a similar philosophy that should ease conversion of existing code. PLANNED FEATURES FOR NEXT VERSIONS ================================== 1. Maps (AKA dicts) with string keys. Candidate models include Patricia tries, critbits, ternary search trees, hash tables, array hashes, and hash-indexed sparse lists. Several implementations could be provided that will use the same high level interface (access, iteration...), and could be selected programmatically through a capability-based interface (e.g. lexical sorting, order preservation...). Benchmarks and stress tests will be performed on all models including the native Tcl hash table implementation. 2. Test suite improvements and better code coverage. 3. Proper internal and user documentation WHAT NEEDS TO BE DONE? ====================== My main development platform is Windows, so the source archive primarily includes Microsoft Visual Studio project files. Microsoft provides a free edition of their development environment known as Visual Studio Express for those willing to compile and test the library without having to buy an expensive license. Other systems need a makefile and/or autoconf scripts. I also use an Ubuntu Karmic Koala 9.10 system for Linux development, so the archive also includes minimalist GNU makefiles for building with the provided GCC compiler. However it makes no use of other parts of the GNU toolchain (autoconf and the like). The code is fairly portable on 32-bit systems. 64-bit support will need more work because all the internals are fine-tuned and optimized at the bit level; however porting to 64-bit should be rather straightforward: the algorithms will remain unchanged, structure access is abstracted behing macros, and cell size is proportional to the machine word size (a cell should be able to store 4 words, which add up to 16 bytes on 32-bit systems). The only part that needs platform-specific code is the low-level page allocator. Colibri needs system calls that allocate boundary-aligned pages. At present both Windows and Unix (Linux) versions are provided, the latter using mmap. Porting to other systems should require only minimal effort, as the platform-specific code is limited to a handful of functions gathered in a platform-specific source subtree. Platform-specific peculiarities should not impact the overall architecture. Tests have been run extensively during development, both on stability and performance. The source includes a test application that runs the test suite by default. Other performance and stress tests are included; note that some are configured in such a way that they require a system with a large memory size (2GB), so make sure to tune their parameters before running them. Last, it lacks user and design documentation, although the source code is extensively commented. GRAND UNIFICATION SCHEME ======================== A great side project would be to reimplement Tcl over Colibri as a replacement for the current Tcl_Obj based code. Of course the resulting library would be incompatible on an ABI level, but this would provide a great testbed for Colibri in real-world cases (using pure script applications) as well as a comparison point with the official Tcl core, and will possibly open the path to Tcl9. This most likely involves a lot of work. WHERE CAN IT BE FOUND? ====================== Wiki page: http://wiki.tcl.tk/Colibri Project page @ SourceForge.net: http://sourceforge.net/projects/tcl9/ Mailing list: http://sourceforge.net/mailarchive/forum.php?forum_name=tcl9-colibri Direct Download: - source: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.6/colibri0.6.src.zip/download - Windows binary: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.6/colibri0.6.win32.zip/download - Linux binary (x86): http://sourceforge.net/projects/tcl9/files/colibri/colibri0.6/colibri0.6.linux-x86.zip/download LICENSE ======= The license is the same as for Tcl. |
From: Frédéric B. <fre...@fr...> - 2010-10-08 12:45:19
|
ANNOUNCE: Colibri version 0.5 ============================= As part of my ongoing work on the Cloverfield project, I am pleased to announce the fifth version of Colibri. CHANGES SINCE VERSION 0.4 ========================= 1. Removed sequence type. It was awkward, hard to use, and didn't quite fit the purpose. Besides, doing a mutable version was too complicated. Replaced by list type improvements. 2. Improved list type. - support for cyclic lists: each list has a loop length field that, when non-zero, gives the length of the terminating loop. Cyclic loops can also easily be detected during iteration thanks to the iterator index. - support for sparse lists: they can now contain void parts (i.e. whose elements are all nil) of arbitrary lengths at the cost of a single machine word. 3. Added mutable vector type. - can grow and shrink arbitrarily within a reserved size set at creation time - can be "frozen" and downgraded to immutable vectors at no cost. 4. Preliminary work for mutable list type. - will use the new sparse list facility. - will downgrade to immutable form easily. 5. Basic exception handling. - catches critical conditions and validation errors with meaningful message - default handler calls abort(), can be overridden by user proc. WHAT IS CLOVERFIELD? ==================== Wiki page: http://wiki.tcl.tk/Cloverfield WHAT DOES COLIBRI STAND FOR? ============================ Colibris, known in English as hummingbirds, are a family of birds known for their small size and high wing speed. The bee hummingbird (Mellisuga helenae), found in Cuba, is the smallest of all birds, with a mass of 1.8 g and a total length of about 5cm. They are renown for their stationary and backward flight abilities on par with flying insects, which allow them to feed on the nectar of plants and flowers in-flight. I've chosen this name for this project because its goal is to be fast and lightweight, and to follow the feather and bird theme shared with Tcl and many related projects. WHAT IS COLIBRI? ================ Colibri is a string and data type infrastructure. It features: - Rope-based strings (see http://wiki.tcl.tk/20690) : * ropes are immutable ... but ... * extraction, insertion, deletion, concatenation... are fast * limited compatibility with native C strings allocated in static space (or guaranteed to survive the whole application lifetime) * support for the full Unicode range (up to 32-bit codepoints) * 1/2/4-byte fixed-width encodings as well as variable-width UTF-8 with transparent conversions and access * custom rope types allows for lazy or procedural representations * iterator interface for easy access - Extensible data type sytem dubbed "words" * similar to Tcl_Obj * minimum payload is identical to that of Tcl_Obj, i.e. 8 bytes, but can take more space if needed * words can have synonyms, i.e. words of any type * synonyms can form chains of arbitrary length, so a word can have several synonyms with different representations -- no more shimmering! * predefined types such as ints, single chars and small strings (up to 3 chars on 32-bit systems) are represented as immediate values: the value is stored directly in the pointer whenever possible rather than in an allocated structure. Everything is abstracted behind accessors. * several high level datastructures are provided, such as vectors (flat arrays) and lists (binary trees of vectors with support for cycles and sparse representation), along with an easy to use iterator interface - Fast and efficient cell-based allocator * page-based allocation for optimal locality of reference and cache use * 16-byte cells on 32-bit systems fit most needs, but elements can allocate more cells if needed (up to 63 cells on 32-bit systems) * overhead is small: 2 bits per 16-byte cell. * raw alloc performances are competitive with the stock system malloc, and in many cases much faster (up to 5 times as fast on small strings and on words vs. Tcl_Obj-like structs) * single cell allocation (the most frequent case) is very fast - Automatic memory management thanks to an exact (AKA accurate or precise), generational, copying, mark-and-sweep, garbage collector * exact GC implies that roots (externally referenced elements) and parent-child relations must be declared by the application, using a simple API * custom types can define a finalizer; one of the applications is to plug in externally managed structures (e.g. malloc/free) * the GC process is fully controllable (pause/resume) so that the application don't get interrupted unexpectedly * generational GC limits the overhead by restricting the collection to newer elements, which are typically short-lived * longer-living elements are collected less often as they get promoted to older generations * promotion is done by moving whole pages between memory pools, optionally performing compaction when fragmentation exceeds a certain threshold, thus limiting the overhead and improving the cache-friendliness over time * contrary to reference-counting schemes (RC), GCs allow circular references without memory leaks; word synonym chains take advantage of this, being implemented as circular lists * studies show that GCs consistently outperform RC in raw performances and real-world cases, because the cost (both space and time) of maintaining reference counters outweights the amortized GC overhead, especially with generational GCs Colibri is written in plain C and is free of any dependency besides system libraries. The compiled binary DLL on Windows is about 40kB. The source code is heavily commented and follows the Tcl quality standards. HOW DOES COLIBRI RELATE TO TCL? =============================== From the Cloverfield announcement: " The last major point of the project is related to implementation and low level issues. The existing Tcl implementation is notoriously solid, however some changes are too radical to be performed on the current code base, so a reimplementation is certainly needed on vast portions. For example, the string representations, object structures, and virtual machines will certainly require complete rewrite to accommodate with the needed changes, which means that a lot of code won't be reused. However vast portions of library code and algorithms certainly will (clock scan, bigint, regexp...), as well as the high coding standards and QA that are the trademarks of our beloved language. " So Colibri is a candidate infrastructure for Tcl9 as an alternative to the current Tcl_Obj-based core implementation. I believe that the features provided by Colibri shall yield higher levels of performances than the current architecture, at the price of an ABI incompatibility (for which major versions are made anyway), but with a similar philosophy that should ease conversion of existing code. PLANNED FEATURES FOR NEXT VERSIONS ================================== 1. Mutable data types with graceful degradation to immutable forms: - Mutable lists. - Maps (AKA dicts) with string keys. Candidate models include Patricia trees and Ternary Search Trees, but not hash tables. 2. Proper internal and user documentation 3. Better test suite WHAT NEEDS TO BE DONE? ====================== My main development platform is Windows, so the source archive primarily includes Microsoft Visual Studio project files. Microsoft provides a free edition of their development environment known as Visual Studio Express for those willing to compile and test the library without having to buy an expensive license. Other systems need a makefile and/or autoconf scripts. I also uses an Ubuntu Karmic Koala 9.10 system for Linux development, so the archive also includes minimalist GNU makefiles for building using the included GCC compiler. However it makes no use of other parts of the GNU toolchain (autoconf and the like). The code is fairly portable on 32-bit systems. 64-bit support will need more work because all the internals are fine-tuned and optimized at the bit level; however porting to 64-bit should be rather straightforward: the algorithms will remain unchanged, structure access is abstracted behing macros, and cell size is proportional to the machine word size (a cell should be able to store 4 words, which add up to 16 bytes on 32-bit systems). The only part that needs platform-specific code is the low-level page allocator. Colibri needs system calls that allocate boundary-aligned pages. At present both Windows and Unix (Linux) version is provided, the latter using mmap. Porting to other systems should require only minimal effort, as the platform-specific code is limited to a handful of functions gathered in a platform-specific source tree. Platform-specific peculiarities should not impact the overall architecture. Tests have been run extensively during development, both on stability and performance. However Colibri lacks a real test suite (including unit testings). The source includes a test application that has to be hand-edited to run or customize specific tests. Note that some stress tests are configured in such a way that they require a system with a large memory size (2GB), so make sure to tune their parameters before running them. Last, it lacks user and design documentation, although the source code is extensively commented. GRAND UNIFICATION SCHEME ======================== A great side project would be to reimplement Tcl over Colibri as a replacement for the current Tcl_Obj based code. Of course the resulting library would be incompatible on an ABI level, but this would provide a great testbed for Colibri in real-world cases (using pure script applications) as well as a comparison point with the official Tcl core, and will possibly open the path to Tcl9. This most likely involves a lot of work. WHERE CAN IT BE FOUND? ====================== Wiki page: http://wiki.tcl.tk/Colibri Project page @ SourceForge.net: http://sourceforge.net/projects/tcl9/ Mailing list: http://sourceforge.net/mailarchive/forum.php?forum_name=tcl9-colibri Direct Download: - source: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.5/colibri0.5.src.zip/download - Windows binary: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.5/colibri0.5.win32.zip/download - Linux binary (x86): http://sourceforge.net/projects/tcl9/files/colibri/colibri0.5/colibri0.5.linux-x86.zip/download LICENSE ======= The license is the same as for Tcl. |
From: Frédéric B. <fre...@fr...> - 2010-08-19 13:09:56
|
ANNOUNCE: Colibri version 0.4 ============================= As part of my ongoing work on the Cloverfield project, I am pleased to announce the fourth version of Colibri. CHANGES SINCE VERSION 0.3 ========================= 1. Added sequence and reference word types. - potentially unlimited collections - self-referencing structures with efficient cycle detection 2. Type- and nil/NULL- related cleanup. 3. Improved source code documentation on the internal data structures. WHAT IS CLOVERFIELD? ==================== Wiki page: http://wiki.tcl.tk/Cloverfield WHAT DOES COLIBRI STAND FOR? ============================ Colibris, known in English as hummingbirds, are a family of birds known for their small size and high wing speed. The bee hummingbird (Mellisuga helenae), found in Cuba, is the smallest of all birds, with a mass of 1.8 g and a total length of about 5cm. They are renown for their stationary and backward flight abilities on par with flying insects, which allow them to feed on the nectar of plants and flowers in-flight. I've chosen this name for this project because its goal is to be fast and lightweight, and to follow the feather and bird theme shared with Tcl and many related projects. WHAT IS COLIBRI? ================ Colibri is a string and data type infrastructure. It features: - Rope-based strings (see http://wiki.tcl.tk/20690) : * ropes are immutable ... but ... * extraction, insertion, deletion, concatenation... are fast * limited compatibility with native C strings allocated in static space (or guaranteed to survive the whole application lifetime) * support for the full Unicode range (up to 32-bit codepoints) * 1/2/4-byte fixed-width encodings as well as variable-width UTF-8 with transparent conversions and access * custom rope types allows for lazy or procedural representations * iterator interface for easy access - Extensible data type sytem dubbed "words" * similar to Tcl_Obj * minimum payload is identical to that of Tcl_Obj, i.e. 8 bytes, but can take more space if needed * words can have synonyms, i.e. words of any type * synonyms can form chains of arbitrary length, so a word can have several synonyms with different representations -- no more shimmering! * predefined types such as ints, single chars and small strings (up to 3 chars on 32-bit systems) are represented as immediate values: the value is stored directly in the pointer rather than in an allocated structure * several high level datastructures are provided, such as vectors (flat arrays), lists (binary trees of vectors), and sequences (potentially cyclic and unlimited linked lists of lists and references to other sequences), along with an easy to use iterator interface - Fast and efficient cell-based allocator * page-based allocation for optimal locality of reference and cache use * 16-byte cells on 32-bit systems fit most needs, but elements can allocate more cells if needed (up to 63 cells on 32-bit systems) * overhead is small: 2 bits per 16-byte cell. * raw alloc performances are competitive with the stock system malloc, and in many cases much faster (up to 5 times as fast on small strings and on words vs. Tcl_Obj-like structs) * single cell allocation (the most frequent case) is very fast - Automatic memory management thanks to an exact (AKA accurate or precise), generational, copying, mark-and-sweep, garbage collector * exact GC implies that roots (externally referenced elements) and parent-child relations must be declared by the application, using a simple API * custom types can define a finalizer; one of the applications is to plug in externally managed structures (e.g. malloc/free) * the GC process is fully controllable (pause/resume) so that the application don't get interrupted unexpectedly * generational GC limits the overhead by restricting the collection to newer elements, which are typically short-lived * longer-living elements are collected less often as they get promoted to older generations * promotion is done by moving whole pages between memory pools, optionally performing compaction when fragmentation exceeds a certain threshold, thus limiting the overhead and improving the cache-friendliness over time * contrary to reference-counting schemes (RC), GCs allow circular references without memory leaks; word synonym chains take advantage of this, being implemented as circular lists * studies show that GCs consistently outperform RC in raw performances and real-world cases, because the cost (both space and time) of maintaining reference counters outweights the amortized GC overhead, especially with generational GCs Colibri is written in plain C and is free of any dependency besides system libraries. The compiled binary DLL on Windows is about 39kB. The source code is heavily commented and follows the Tcl quality standards. HOW DOES COLIBRI RELATE TO TCL? =============================== From the Cloverfield announcement: " The last major point of the project is related to implementation and low level issues. The existing Tcl implementation is notoriously solid, however some changes are too radical to be performed on the current code base, so a reimplementation is certainly needed on vast portions. For example, the string representations, object structures, and virtual machines will certainly require complete rewrite to accommodate with the needed changes, which means that a lot of code won't be reused. However vast portions of library code and algorithms certainly will (clock scan, bigint, regexp...), as well as the high coding standards and QA that are the trademarks of our beloved language. " So Colibri is a candidate infrastructure for Tcl9 as an alternative to the current Tcl_Obj-based core implementation. I believe that the features provided by Colibri shall yield higher levels of performances than the current architecture, at the price of an ABI incompatibility (for which major versions are made anyway), but with a similar philosophy that should ease conversion of existing code. PLANNED FEATURES FOR NEXT VERSIONS ================================== 1. Mutable data types with circular references and graceful degradation to immutable types: - Mutable lists. Candidate models include unrolled linked lists and skip lists. - Maps (AKA dicts) with string keys. Candidate models include Patricia trees and Ternary Search Trees, but not hash tables. 2. Proper internal and user documentation 3. Better test suite WHAT NEEDS TO BE DONE? ====================== My main development platform is Windows, so the source archive primarily includes Microsoft Visual Studio project files. Microsoft provides a free edition of their development environment known as Visual Studio Express for those willing to compile and test the library without having to buy an expensive license. Other systems need a makefile and/or autoconf scripts. I also uses a CentOS 5.2 Linux VMware image for Linux development, so the archive also includes minimalist GNU makefiles for building using the included GCC compiler. However it makes no use of other parts of the GNU toolchain (autoconf and the like). The code is fairly portable on 32-bit systems. 64-bit support will need more work because all the internals are fine-tuned and optimized at the bit level; however porting to 64-bit should be rather straightforward: the algorithms will remain unchanged, structure access is abstracted behing macros, and cell size is proportional to the word size (a cell should be able to store 4 pointers, which add up to 16 bytes on 32-bit systems). The only part that needs platform-specific code is the low-level page allocator. Colibri needs system calls that allocate boundary-aligned pages. At present both Windows and Unix (Linux) version is provided, the latter using mmap. Porting to other systems should require only minimal effort, as the platform-specific code is limited to a handful of functions gathered in a platform-specific source tree. Platform-specific peculiarities should not impact the overall architecture. Exception handling is poor, especially because Colibri is meant to be part of a larger system that provides the appropriate infrastructure. "Exception handling" includes what to do upon allocation failures due to low-memory conditions. As Colibri uses a non-prehemptive model it cannot stop-the-world and GC like other implementations do to free memory. More generally, due to the nature of the model, such conditions would only occur when the memory is actually used or when the GC is not triggered often enough (for example during a long computation phase involving a large number of temporary allocations), both being unrecoverable anyway without explicit programming. Tests have been run extensively during development, both on stability and performance. However Colibri lacks a real test suite (including unit testings). The source includes a test application that has to be hand-edited to run or customize specific tests. Note that some stress tests are configured in such a way that they require a system with a large memory size (2GB), so make sure to tune their parameters before running them. Last, it lacks user and design documentation, although the source code is extensively commented. GRAND UNIFICATION SCHEME ======================== A great side project would be to reimplement Tcl over Colibri as a replacement for the current Tcl_Obj based code. Of course the resulting library would be incompatible on an ABI level, but this would provide a great testbed for Colibri in real-world cases (using pure script applications) as well as a comparison point with the official Tcl core, and will possibly open the path to Tcl9. This most likely involves a lot of work. WHERE CAN IT BE FOUND? ====================== Wiki page: http://wiki.tcl.tk/Colibri Project page @ SourceForge.net: http://sourceforge.net/projects/tcl9/ Mailing list: http://sourceforge.net/mailarchive/forum.php?forum_name=tcl9-colibri Direct Download: - source: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.4/colibri0.4.src.zip/download - Windows binary: http://sourceforge.net/projects/tcl9/files/colibri/colibri0.4/colibri0.4.win32.zip/download - Linux binary (x86): http://sourceforge.net/projects/tcl9/files/colibri/colibri0.4/colibri0.4.linux-x86.zip/download LICENSE ======= The license is the same as for Tcl. |
From: Frédéric B. <fre...@fr...> - 2009-09-22 20:25:48
|
ANNOUNCE: Colibri version 0.3 ============================= As part of my ongoing work on the Cloverfield project, I am pleased to announce the third version of Colibri. WHAT IS CLOVERFIELD? ==================== Wiki page: http://wiki.tcl.tk/Cloverfield WHAT DOES COLIBRI STAND FOR? ============================ Colibris, known in English as hummingbirds, are a family of birds known for their small size and high wing speed. The bee hummingbird (Mellisuga helenae), found in Cuba, is the smallest of all birds, with a mass of 1.8 g and a total length of about 5cm. They are renown for their stationary and backward flight abilities on par with flying insects, which allow them to feed on the nectar of plants and flowers in-flight. I've chosen this name for this project because its goal is to be fast and lightweight, and to follow the feather and bird theme shared with Tcl and many related projects. WHAT IS COLIBRI? ================ Colibri is a string and data type infrastructure. It features: - Rope-based strings (see http://wiki.tcl.tk/20690) : * ropes are immutable ... but ... * extraction, insertion, deletion, concatenation... are fast * limited compatibility with native C strings allocated in static space (or guaranteed to survive the whole application lifetime) * support for the full Unicode range (up to 32-bit codepoints) * 1/2/4-byte fixed-width encodings as well as variable-width UTF-8 with transparent conversions and access * custom rope types allows for lazy or procedural representations * iterator interface for easy access - Extensible data type sytem dubbed "words" * similar to Tcl_Obj * minimum payload is identical to that of Tcl_Obj, i.e. 8 bytes, but can take more space if needed * words can have synonyms, i.e. words of any type * synonyms can form chains of arbitrary length, so a word can have several synonyms with different representations -- no more shimmering! * predefined types such as ints, single chars and small strings (up to 3 chars on 32-bit systems) are represented as immediate values: the value is stored directly in the pointer rather than in an allocated structure - Fast and efficient cell-based allocator * page-based allocation for optimal locality of reference and cache use * 16-byte cells on 32-bit systems fit most needs, but elements can allocate more cells if needed (up to 63 cells on 32-bit systems) * overhead is small: 2 bits per 16-byte cell. * raw alloc performances are competitive with the stock system malloc, and in many cases much faster (up to 5 times as fast on small strings and on words vs. Tcl_Obj-like structs) * single cell allocation (the most frequent case) is very fast - Automatic memory management thanks to an exact (AKA accurate or precise), generational, copying, mark-and-sweep, garbage collector * exact GC implies that roots (externally referenced elements) and parent-child relations must be declared by the application, using a simple API * custom types can define a finalizer; one of the applications is to plug in externally managed structures (e.g. malloc/free) * the GC process is fully controllable (pause/resume) so that the application don't get interrupted unexpectedly * generational GC limits the overhead by restricting the collection to newer elements, which are typically short-lived * longer-living elements are collected less often as they get promoted to older generations * promotion is done by moving whole pages between memory pools, optionally performing compaction when fragmentation exceeds a certain threshold, thus limiting the overhead and improving the cache-friendliness over time * contrary to reference-counting schemes (RC), GCs allow circular references without memory leaks; word synonym chains take advantage of this, being implemented as circular lists * studies show that GCs consistently outperform RC in raw performances and real-world cases, because the cost (both space and time) of maintaining reference counters outweights the amortized GC overhead, especially with generational GCs Colibri is written in plain C and is free of any dependency besides system libraries. The compiled binary DLL on Windows is about 32kB. The source code is heavily commented and follows the Tcl quality standards. HOW DOES COLIBRI RELATE TO TCL? =============================== From the Cloverfield announcement: " The last major point of the project is related to implementation and low level issues. The existing Tcl implementation is notoriously solid, however some changes are too radical to be performed on the current code base, so a reimplementation is certainly needed on vast portions. For example, the string representations, object structures, and virtual machines will certainly require complete rewrite to accommodate with the needed changes, which means that a lot of code won't be reused. However vast portions of library code and algorithms certainly will (clock scan, bigint, regexp...), as well as the high coding standards and QA that are the trademarks of our beloved language. " So Colibri is a candidate infrastructure for Tcl9 as an alternative to the current Tcl_Obj-based core implementation. I believe that the features provided by Colibri shall yield higher levels of performances than the current architecture, at the price of an ABI incompatibility (for which major versions are made anyway), but with a similar philosophy that should ease conversion of existing code. CHANGES SINCE VERSION 0.2 ========================= 1. Multithreading support - Appartment model, each thread has its own memory pool 2. Linux port and code portability improvements - Now uses C99 types such as int8_t and uintptr_t - Moved platform-specific code to platform/* subtree - Linux version uses mmap and pthreads 3. Test app overhaul - Each test has its own C file - Multithreading support PLANNED FEATURES FOR NEXT VERSION ================================= 1. Mutable data types with circular references and graceful degradation to immutable types: - Mutable lists. Candidate models include unrolled linked lists and skip lists. - Maps (AKA dicts) with string keys. Candidate models include Patricia trees and Ternary Search Trees, but not hash tables. 2. Proper internal and user documentation WHAT NEEDS TO BE DONE? ====================== My main development platform is Windows, so the source archive primarily includes Microsoft Visual Studio project files. Microsoft provides a free edition of their development environment known as Visual Studio Express for those willing to compile and test the library without having to buy an expensive license. Other systems need a makefile and/or autoconf scripts. I also uses a CentOS 5.2 Linux VMware image for Linux development, so the archive also includes minimalist GNU makefiles for building using the included GCC compiler. However it makes no use of other parts of the GNU toolchain (autoconf and the like). The code is fairly portable on 32-bit systems. 64-bit support will need more work because all the internals are fine-tuned and optimized at the bit level; however porting to 64-bit should be rather straightforward: the algorithms will remain unchanged, structure access is abstracted behing macros, and cell size is proportional to the word size (a cell should be able to store 4 pointers, which add up to 16 bytes on 32-bit systems). The only part that needs platform-specific code is the low-level page allocator. Colibri needs system calls that allocate boundary-aligned pages. At present both Windows and Unix (Linux) version is provided, the latter using mmap. Porting to other systems should require only minimal effort, as the platform-specific code is limited to a handful of functions gathered in a platform-specific source tree. Platform-specific peculiarities should not impact the overall architecture. Exception handling is poor, especially because Colibri is meant to be part of a larger system that provides the appropriate infrastructure. "Exception handling" includes what to do upon allocation failures due to low-memory conditions. As Colibri uses a non-prehemptive model it cannot stop-the-world and GC like other implementations do to free memory. More generally, due to the nature of the model, such conditions would only occur when the memory is actually used or when the GC is not triggered often enough (for example during a long computation phase involving a large number of temporary allocations), both being unrecoverable anyway without explicit programming. Tests have been run extensively during development, both on stability and performance. However Colibri lacks a real test suite (including unit testings). The source includes a test application that has to be hand-edited to run or customize specific tests. Note that some tests are configured in such a way that they require a system with a large memory size (2GB), so make sure to tune their parameters before running them. Last, it lacks user and design documentation, although the source code is extensively commented. GRAND UNIFICATION SCHEME ======================== A great side project would be to reimplement Tcl over Colibri as a replacement for the current Tcl_Obj based code. Of course the resulting library would be incompatible on an ABI level, but this would provide a great testbed for Colibri in real-world cases (using pure script applications) as well as a comparison point with the official Tcl core, and will possibly open the path to Tcl9. This most likely involves a lot of work. WHERE CAN IT BE FOUND? ====================== Wiki page: http://wiki.tcl.tk/Colibri Project page @ SourceForge.net: https://sourceforge.net/projects/tcl9/ Mailing list: https://sourceforge.net/mailarchive/forum.php?forum_name=tcl9-colibri Direct Download: - source: http://downloads.sourceforge.net/tcl9/colibri0.3.src.zip?use_mirror= - Windows binary: http://downloads.sourceforge.net/tcl9/colibri0.3.win32.zip?use_mirror= - Linux binary (x86): http://downloads.sourceforge.net/tcl9/colibri0.3.linux-x86.zip?use_mirror= LICENSE ======= The license is the same as for Tcl. |
From: Frédéric B. <fre...@fr...> - 2009-04-16 14:18:25
|
ANNOUNCE: Colibri version 0.2 ============================= As part of my ongoing work on the Cloverfield project, I am pleased to announce the second version of colibri. WHAT IS CLOVERFIELD? ==================== Wiki page: http://wiki.tcl.tk/Cloverfield WHAT DOES COLIBRI STAND FOR? ============================ Colibris, known in English as hummingbirds, are a family of birds known for their small size and high wing speed. The bee hummingbird (Mellisuga helenae), found in Cuba, is the smallest of all birds, with a mass of 1.8 g and a total length of about 5cm. They are renown for their stationary and backward flight abilities on par with flying insects, which allow them to feed on the nectar of plants and flowers in-flight. I've chosen this name for this project because its goal is to be fast and lightweight, and to follow the feather and bird theme shared with Tcl and many related projects. WHAT IS COLIBRI? ================ Colibri is a string and data type infrastructure. It features: - Rope-based strings (see http://wiki.tcl.tk/20690) : * ropes are immutable ... but ... * extraction, insertion, deletion, concatenation... are fast * limited compatibility with native C strings allocated in static space (or guaranteed to survive the whole application lifetime) * support for the full Unicode range (up to 32-bit codepoints) * 1/2/4-byte fixed-width encodings as well as variable-width UTF-8 with transparent conversions and access * custom rope types allows for lazy or procedural representations * iterator interface for easy access - Extensible data type sytem dubbed "words" * similar to Tcl_Obj * minimum payload is identical to that of Tcl_Obj, i.e. 8 bytes, but can take more space if needed * words can have synonyms, i.e. words of any type * synonyms can form chains of arbitrary length, so a word can have several synonyms with different representations -- no more shimmering! * predefined types such as ints, single chars and small strings (up to 3 chars on 32-bit systems) are represented as immediate values: the value is stored directly in the pointer rather than in an allocated structure - Fast and efficient cell-based allocator * page-based allocation for optimal locality of reference and cache use * 16-byte cells on 32-bit systems fit most needs, but elements can allocate more cells if needed (up to 63 cells on 32-bit systems) * overhead is small: 2 bits per 16-byte cell. * raw alloc performances are competitive with the stock system malloc, and in many cases much faster (up to 5 times as fast on small strings and on words vs. Tcl_Obj-like structs) * single cell allocation (the most frequent case) is very fast - Automatic memory management thanks to an exact (AKA accurate or precise), generational, copying, mark-and-sweep, garbage collector * exact GC implies that roots (externally referenced elements) and parent-child relations must be declared by the application, using a simple API * custom types can define a finalizer; one of the applications is to plug in externally managed structures (e.g. malloc/free) * the GC process is fully controllable (pause/resume) so that the application don't get interrupted unexpectedly * generational GC limits the overhead by restricting the collection to newer elements, which are typically short-lived * longer-living elements are collected less often as they get promoted to older generations * promotion is done by moving whole pages between memory pools, optionally performing compaction when fragmentation exceeds a certain threshold, thus limiting the overhead and improving the cache-friendliness over time * contrary to reference-counting schemes (RC), GCs allow circular references without memory leaks; word synonym chains take advantage of this, being implemented as circular lists * studies show that GCs consistently outperform RC in raw performances and real-world cases, because the cost (both space and time) of maintaining reference counters outweights the amortized GC overhead, especially with generational GCs Colibri is written in plain C and is free of any dependency besides system libraries. The compiled binary DLL on Windows is about 32kB. The source code is heavily commented and follows the Tcl quality standards. HOW DOES COLIBRI RELATE TO TCL? =============================== From the Cloverfield announcement: " The last major point of the project is related to implementation and low level issues. The existing Tcl implementation is notoriously solid, however some changes are too radical to be performed on the current code base, so a reimplementation is certainly needed on vast portions. For example, the string representations, object structures, and virtual machines will certainly require complete rewrite to accommodate with the needed changes, which means that a lot of code won't be reused. However vast portions of library code and algorithms certainly will (clock scan, bigint, regexp...), as well as the high coding standards and QA that are the trademarks of our beloved language. " So Colibri is a candidate infrastructure for Tcl9 as an alternative to the current Tcl_Obj-based core implementation. I believe that the features provided by Colibri shall yield higher levels of performances than the current architecture, at the price of an ABI incompatibility (for which major versions are made anyway), but with a similar philosophy that should ease conversion of existing code. CHANGES SINCE VERSION 0.1 ========================= 1. Major rewrite of GC code: - Roots are now sorted by generation for better performances with large number of instances. - Reworked parent-child relation handling. Now uses one cell per parent instead of one per parent-child pair, in combination with a parent flag (replaces the now suppressed generation flag). - Roots and parent-child relations now use cells allocated in a dedicated memory pool. - No more per-cell generation flag; all cells in a page belong to the same generation. Promotion is now immediate upon GC. - Promotion is done per page rather than per cell: whole pages are moved up one generation at each GC. CPU overhead is much smaller. - As a consequence, the first generation is always emptied at each GC, which speeds up further allocations. - The last collected generation is by default merged with the first uncollected one, but may be relocated if fragmentation reaches a certain threshold; this is done by copying individual cells instead of moving pages. - Relocation and redirection are now performed inline during the mark phase, versus during later steps in previous versions. The resulting code is much simpler, and avoids cache-intensive traversal of whole memory pools. 2. Improved performances in every domain thanks to point 1. 3. Added two high-level data types: vectors and lists. - Vectors are flat collections of words. Their size is limited, as they must fit within one single page. They are directly addressable. - Lists are linear collections of words, made by assembling vectors into balanced binary trees. They are to vectors what ropes are to strings. Like ropes, lists are immutable. Iterator and traversal APIs are provided. PLANNED FEATURES FOR NEXT VERSION ================================= 1. Mutable data types with circular references and graceful degradation to immutable types: - Mutable lists. Candidate models include unrolled linked lists and skip lists. - Maps (AKA dicts) with string keys. Candidate models include Patricia trees and Ternary Search Trees, but not hash tables. 2. Proper internal and user documentation WHAT NEEDS TO BE DONE? ====================== My main development platform is Windows, so for now the source archive only includes Microsoft Visual Studio project files. Microsoft provides a free edition of their development environment known as Visual Studio Express for those willing to compile and test the library without having to buy an expensive license. Other systems need a makefile and/or autoconf scripts. The code is fairly portable on 32-bit systems. 64-bit support will need more work because all the internals are fine-tuned and optimized at the bit level; however porting to 64-bit should be rather straightforward: the algorithms will remain unchanged, structure access is abstracted behing macros, and cell size is proportional to the word size (a cell should be able to store 4 pointers, which add up to 16 bytes on 32-bit systems). The only part that needs platform-specific code is the low-level page allocator. Colibri needs system calls that allocate boundary-aligned pages. At present only a Windows version is provided, but porting to other systems should require minimal effort, as the platform-specific code is limited to a handful of functions. Platform-specific peculiarities should not impact the overall architecture. Exception handling is poor, especially because Colibri is meant to be part of a larger system that provides the appropriate infrastructure. The current implementation is not thread-safe as it uses unprotected global structures. However I don't think that adding synchronization is necessary given that it should eventually be integrated into a larger application with the same appartment threading model as Tcl. The main advantage of having a global allocator is inter-thread data sharing, and Tcl typically shares no data between threads. As Colibri is not designed to be a general-purpose solution, and unless there is a real benefit in using a global thread-safe allocator, converting global structures to thread-local should fit our needs. Tests have been run extensively during development, both on stability and performance. However Colibri lacks a real test suite. The source includes a test application that has to be hand-edited to run or customize specific tests. Note that the current values defined by this application require a system with a large memory size (2GB). Last, it lacks user and design documentation, although the source code is extensively commented. GRAND UNIFICATION SCHEME ======================== A great side project would be to reimplement Tcl over Colibri as a replacement for the current Tcl_Obj based code. Of course the resulting library would be incompatible on an ABI level, but this would provide a great testbed for Colibri in real-world cases (using pure script applications) as well as a comparison point with the official Tcl core, and will possibly open the path to Tcl9. This most likely involves a lot of work. WHERE CAN IT BE FOUND? ====================== Wiki page: http://wiki.tcl.tk/Colibri Project page @ SourceForge.net: https://sourceforge.net/projects/tcl9/ Mailing list: https://sourceforge.net/mailarchive/forum.php?forum_name=tcl9-colibri Direct Download: - source: http://downloads.sourceforge.net/tcl9/colibri0.2.src.zip?use_mirror= - Windows binary: http://downloads.sourceforge.net/tcl9/colibri0.2.win32.zip?use_mirror= LICENSE ======= The license is the same as for Tcl. |
From: Frédéric B. <fre...@fr...> - 2009-02-13 23:49:16
|
ANNOUNCE: Colibri version 0.1 ============================= As part of my ongoing work on the Cloverfield project, I am pleased to announce the first version of Colibri. WHAT IS CLOVERFIELD ? ===================== Wiki page: http://wiki.tcl.tk/Cloverfield WHAT DOES COLIBRI STANDS FOR ? ============================== Colibris, known in English as hummingbirds, are a family of bird known for their small size and high wing speed. The bee hummingbird (Mellisuga helenae), found in Cuba, is the smallest of all birds, with a mass of 1.8 g and a total length of about 5cm. They are renowed for their stationary and backward flight abilities on par with flying insects, which allows them to feed on the nectar of plants and flowers in-flight. I've chose this name for this project because its goal is to be fast and lightweight, and to follow the feather and bird theme shared with Tcl and many related projects. WHAT IS COLIBRI ? ================= Colibri is a string and data type infrastructure. It features: - Rope-based strings (see http://wiki.tcl.tk/20690) : * ropes are immutable ... but ... * extraction, insertion, deletion, concatenation... are fast * limited compatibility with native C strings allocated in static space (or guaranteed to survive the whole application lifetime) * support for the full Unicode range (up to 32-bit codepoints) * 1/2/4-byte fixed-width encodings as well as variable-width UTF-8 with transparent conversions and access * custom rope types allows for lazy or procedural representations * iterator interface for easy access - Extensible data type sytem dubbed "words" * similar to Tcl_Obj * minimum payload is identical to that of Tcl_Obj, i.e. 8 bytes, but can take more space if needed * words have values that can be of any type, including other words * values can form chains of arbitrary length, so a word can have several values with different representations -- no more shimmering! * predefined types such as ints, single chars and small strings (up to 3 chars on 32-bit systems) are represented as immediate values: the value is stored directly in the pointer rather than in an allocated structure - Fast and efficient cell-based allocator * page-based allocation for optimal locality of reference and cache use * 16-byte cells on 32-bit systems fit most needs, but elements can allocate more cells if needed (up to 63 cells on 32-bit systems) * overhead is small: 2 bits per 16-byte cell. * raw alloc performances are competitive with the stock system malloc, and in many cases much faster (up to 5 times as fast on small strings and on words vs. Tcl_Obj-like structs) * single cell allocation (the most frequent case) is very fast - Automatic memory management thanks to an exact (AKA accurate or precise), generational, copying, mark-and-sweep, garbage collector * exact GC implies that roots (externally referenced elements) and parent-child relations must be declared by the application, using a simple API * custom types can define a finalizer * the GC process is fully controllable (pause/resume) so that the application don't get interrupted unexpectedly * generational GC limits the overhead by restricting the collection to newer elements, which are typically short-lived * longer-living elements are collected less often and get promoted after two cycles * promotion is done by copying old elements to older pools, which performs compaction as a side effect, thus limiting the fragmentation and improving the cache-friendliness over time * contrary to reference-counting schemes (RC), GCs allow circular references without memory leaks * studies show that GCs consistently outperform RC in raw performances and real-world cases, because the cost (both space and time) of maintaining reference counters outweights the amortized GC overhead, especially with generational GCs Colibri is written in plain C and is free of any dependency besides system libraries. The compiled binary DLL on Windows is about 27kB. The source code is heavily commented and follows the Tcl quality standards. HOW DOES COLIBRI RELATES TO TCL =============================== From the Cloverfield announcement: " The last major point of the project is related to implementation and low level issues. The existing Tcl implementation is notoriously solid, however some changes are too radical to be performed on the current code base, so a reimplementation is certainly needed on vast portions. For example, the string representations, object structures, and virtual machines will certainly require complete rewrite to accommodate with the needed changes, which means that a lot of code won't be reused. However vast portions of library code and algorithms certainly will (clock scan, bigint, regexp...), as well as the high coding standards and QA that are the trademarks of our beloved language. " So Colibri is a candidate infrastructure for Tcl9 as an alternative to the current Tcl_Obj-based core implementation. I believe that the features provided by Colibri shall yield higher levels of performances than the current architecture, at the price of an ABI incompatibility (for which major versions are made anyway), but with a similar philosophy that should ease conversion of existing code. WHAT NEEDS TO BE DONE ? ======================= My main development platform is Windows, so for now the source archive only includes Microsoft Visual Studio project files. Microsoft provides a free edition of their development environment known as Visual Studio Express for those willing to compile and test the library without having to buy an expensive license. Other systems need a makefile and/or autoconf scripts. The code is fairly portable on 32-bit systems. 64-bit support will need more work because all the internals are fine-tuned and optimized at the bit level; however porting to 64-bit should be rather straightforward: the algorithms will remain unchanged, structure access is abstracted behing macros, and cell size is proportional to the word size (a cell should be able to store 4 pointers, which add up to 16 bytes on 32-bit systems). The only part that needs platform-specific code is the low-level page allocator. Colibri needs system calls that allocate boundary-aligned pages. At present only a Windows version is provided, but porting to other systems should require minimal effort, as the platform-specific code is limited to a handful of functions. Platform-specific peculiarities should not impact the overall architecture. Exception handling is poor, especially because Colibri is meant to be part of a larger system that provides the appropriate infrastructure. The current implementation is not thread-safe as it uses unprotected global structures. However I don't think that adding synchronization is necessary given that it should eventually be integrated into a larger application with the same appartment threading model as Tcl. The main advantage of having a global allocator is inter-thread data sharing, and Tcl typically shares no data between threads. As Colibri is not designed to be a general-purpose solution, and unless there is a real benefit in using a global thread-safe allocator, converting global structures to thread-local should fit our needs. Tests have been run extensively during development, both on stability and performance. However Colibri lacks a real test suite. The source includes a test application that has to be hand-edited to run or customize specific tests. Note that the current values defined by this application require a system with a large memory size (2GB). Last, it lacks user and design documentation, although the source code is extensively commented. WHERE CAN IT BE FOUND ? ======================= Wiki page: http://wiki.tcl.tk/Colibri Project page @ SourceForge.net: https://sourceforge.net/projects/tcl9/ Mailing list: https://sourceforge.net/mailarchive/forum.php?forum_name=tcl9-colibri Direct Download: - source: http://downloads.sourceforge.net/tcl9/colibri0.1.src.zip?use_mirror= - Windows binary: http://downloads.sourceforge.net/tcl9/colibri0.1.win32.zip?use_mirror= LICENSE ======= The license is the same as for Tcl. |
From: Andy G. <unu...@ai...> - 2009-01-21 05:53:10
|
"Andy Goth" <unu...@ai...> wrote: > void process_list(node* head) > { > int index; > node* current = head; > while (1) { > current = get_node_by_index(head, index); > if (current == NULL) { > break; > } > process_node(head); > } > } Oops. Add a ++index; in there, will ya? -- Andy Goth | http://andy.junkdrome.org/ unununium@{aircanopy.net,openverse.com} |
From: Andy G. <unu...@ai...> - 2009-01-20 16:19:37
|
Dammit, I was just about to go to work, and then your email hit my inbox! Next time, wait five more minutes before sending! ;^) "Frédéric Bonnet" <fre...@fr...> wrote: > Large objects can use contiguous malloc'd blocks wrapped around custom > objects, Good idea; I didn't think of that. It's not like your work completely replaces malloc. They can be used together. > or they can be split into smaller nodes and assembled using a > tree (like ropes do with string chunks). That is what I was thinking of. I was just a little worried that this would become cumbersome. > the latter is preferable and would give better performances especially > because of data sharing and limited memory allocation overhead. I doubt it'll be measurable, if my assumption holds true. My assumption is that large blocks will be rare and infrequently reallocated. I'm thinking a couple might be created when the extension library is loaded or initialized. Possibly they won't even be reachable from the Cloverfield memory manager; they can just be referenced by pointers kept in the library's data segment. Internal stuff, you understand. Note that these blocks might contain pointers to stuff that is allocated using your manager, which is just fine. > Regarding mutability, I've dropped refcounted objects in favor of a > mark-and-sweep GC Let me get this straight. In Tcl, the structure only has to be copied on write when it is shared. The copy will not be shared, so it can be modified in-place. After doing the copy, it's possible that the original might cease to be shared. In Cloverfield, nothing can be modified in-place, since everything could potentially be shared. There's no way to prove otherwise. So modifications are always done by creating new objects that are altered copies of the original. To cut down on the overhead of copying, you want to keep the object size as small as possible. Sound right? > > > - traversal is near linear in most case because we can keep > > > contextual info in iterator structures and directly access > > > the leaves that contain actual data > > > > Can you explain this for me? > > Have a look at the iterator code. Sorry, I have forgotten to read your code, and I certainly don't have time now. I don't have time to write this email either, but I'm doing it anyway. ;^) > In short, iterators are > stack-allocated structures that store the current iteration context, > that is, the leaf node and the local index where iteration takes place > plus info about nodes at upper levels. So it won't be necessary to start over all the time. Got it. I've seen this at work with linked lists. node* get_next_node(node* head) { return head->next; } node* get_node_by_index(node* head, int index) { for (; index > 0; --index) { head = get_next_node(head); } return head; } The first function is what you describe, it takes constant time, and it can be used to iterate over a list in linear time. The second function takes linear time, and it can be used to iterate over a list in quadratic time. How can it be used to iterate over a list?, I hear you ask. void process_list(node* head) { int index; node* current = head; while (1) { current = get_node_by_index(head, index); if (current == NULL) { break; } process_node(head); } } Now I hear you ask WTF?!, but I can't answer that question. I wish I could... Seriously, I have seen code like this before. It didn't really matter, since the list was only one or two elements long. But it was gross nevertheless. I would have replaced it with a preallocated array with a reasonable maximum size, such as ten. > - small objects *should* fit within a single cell as much as > possible. So do you have code to automatically split the object across cells? > - large objects *must* be split into smaller ones; Now you have me confused, unless there's a third class of object that you consider to be neither small nor large. So how do you define small and large? Hmm. Later in this email you talk about pages as well as cells. I guess you're trying to say that the preferred object size is at most one cell, and all "physical" objects must be at most one page. The caller can string together "physical" objects into "logical" objects if they need to be really big. > alternatively, one memory mapping (a strength rather than a weakness). What's memory mapping, in this context? > I believe that only very specialized types would be impacted by this > limit, and it's also likely that those would already rely on underlying > libraries that provide their own structures The underlying libraries are likely to use malloc anyway, or perhaps they have their own customized allocator schemes. Some require the caller to manage memory by putting everything on the stack, in the data or BSS segments, or on the heap. In that last case the programmer will still need malloc, but that is fine so long as the programmer is given the opportunity to register a destructor callback to be executed when the object is reaped by the garbage collector. > My design is based on the observation that modern architectures are > neither limited in computation performances nor in memory size, [...] The traditional malloc approach is optimized for ease of programming in C-like language. It makes perfect sense to pick something else that may be harder to use in C when the programmer will be coding in a higher-level scripting language anyway. The goal of any programming language is to make things easy for the programmer, so if the programmer isn't going to be using malloc directly, it's no longer necessary to incur malloc's performance cost. Unload the baggage. ;^) > The current code builds flat strings if the whole data fits into no > more than 3 cells. This is a totally arbitrary value that I chose as a > compromise between memory consumption and data sharing/duplication. Once there is Real Code written in this New Scripting Language, we can graph this threshold versus execution time and memory utilization, then pick whatever threshold performs best. > > set a (0 1 2) > > set b ({*}$a 3 4 5) > > set c {{*}$a 3 4 5} > % puts [lindex $b 0] > 0 > % puts [lindex $c 0] > $a Why is {*} interpreted inside a brace-quoted word? Tcl performs no modifications to the contents of a brace-quoted word; the only processing it does is finding the closing brace. I would prefer to keep things this way, even if it's more complicated now to find the closing brace. Otherwise, won't there be trouble with brace-quoted scripts? It should be possible to get the string representation of a script without losing all the {*}'s inside. > b is expanded at evaluation time and forms a 6-element list. The parse tree for the last word of the second line would say: - construct a list as follows: - expand to multiple elements: current value of variable "a" - delimiter: " " - element: "3" - delimiter: " " - element: "4" - delimiter: " " - element: "5" When executed, this parse tree would construct an object with the following two representations: 1. String representation: rope concatenating the string value of "a" with " 3 4 5" 2. List representation: rope concatenating the list value of "a" with (3 4 5) - Each element is a rope pointing into the same memory as the non-space parts of representation (1) It's necessary to build both representations for a parenthesized word, due to the way you defined parens quoting. Basically it will index the same data in two different ways. It's possible to instead make a unified third representation that looks a lot like the above parse tree, then later build string or list representations from this third representation, as needed. > % set v [string repeat "0123456789" 10000] > 012345678901234567890123456789<... 99940 chars > skipped>012345678901234567890123456789 > > The idea could be extended not only to strings but to containers as > well (lists). So you could apply this concept to errors caused by > stale references. Reminds me of Python's str()/repr() dichotomy. Not necessarily a bad thing. Making them one and the same is a nice goal, but it has limitations. So this [info] subcommand would return a tagged list describing the structure of the object being queried, without necessarily returning its actual string representation. > (Note that Cloverfield avoids this class of problems by design since > references always have a valid string rep and a known value in case > they are unresolved, although this may change if we allow the string > rep generation to fail, as is possible with file backed memory > mapping.) The concern I have about references having a valid string representations is that they can be constructed by accident or as a result of malicious user input. Tcl has duck typing, meaning that an object's "type" is determined by how the code uses the object. But references are automatically dereferenced, not explictly dereferenced, so it's possible to convince otherwise-correct code to do the wrong thing just by feeding it a specially crafted string. My references don't have a string representation, so the "referencehood" is signaled by an unforgeable out-of-band condition. It's a similar deal as null being signaled by variable existence. > > Now for pointer trickery. Thankfully, pointers always have string > > representations, even when they point to variables that don't exist > > yet. ;^) > > So I get that your @ suffix is syntactic sugar for an explicit pointer > dereference. Yeah, but I think it's a bit more than sugar, since it enables the programmer to communicate his intent directly to the variable lookup code without detouring through a command substitution. This should result in much better bytecodes, and it's easier for humans and debuggers to read. It's a major distraction to back up to the beginning of the substitution to insert "[deref", then return to where the pointer actually gets followed and insert "]". A similar problem happens in C; with "*" instead of "deref". So I put the pointer-follow notation at the point where it actually happens! It's like "->" or "[0]" in C. I've actually had to use the latter in macros; it sometimes comes in handy. > > puts $&[whatever]{0} -> 5 > > puts $&(5 4 3 2 1 0){0} -> 5 > > Interesting. This allows the variable indexing syntax without an > actual variables. That's the idea. > Kind of C++ anonymous objects. Where the function returns by reference? This can be done somewhat in C, except the caller has to remember to dereference using * or ->. > This means that you can merge (in the string sense) two > "objects" without caring about their actual string rep. What happens when two lists are merged, versus when two strings are merged? Tcl: % proc string_concat {a b} {return $a$b} % concat {0 1 2} {3 4 5} 0 1 2 3 4 5 % string_concat {0 1 2} {3 4 5} 0 1 23 4 5 By the way, I think a [string concat] command would be very useful in functional contexts. > in the grand OO tradition, ropes define an interface, not a structure. Actually I think that's the same as what I suggest, in that the different representations are unified, not special cases. You're proposing to also unify strings with the other object types. Sounds good to me. -- Andy Goth | http://andy.junkdrome.org/ unununium@{aircanopy.net,openverse.com} |
From: Frédéric B. <fre...@fr...> - 2009-01-20 14:55:14
|
----- "Andy Goth" <unu...@ai...> a écrit : > How will the block size limit impact the extension programmer when > dealing with large objects? I guess it will be possible to subdivide > large objects into smaller objects with some kind of master object > that points to them. And arrays (which I find are the main reason why > objects get to be large) can be handled by this new code you're about > to write. Large objects can use contiguous malloc'd blocks wrapped around custom objects, or they can be split into smaller nodes and assembled using a tree (like ropes do with string chunks). In a purely functional, immutable model, the latter is preferable and would give better performances especially because of data sharing and limited memory allocation overhead. Regarding mutability, as I've dropped refcounted objects in favor of a mark-and-sweep GC, the current Tcl solution (using large mutable objects with copy-on-write) is no longer viable, because there is no way to tell whether an object is shared or not, so you have to copy the whole structure every time it's modified, and you end up with an immutable model. For this reason, I believe that a tree with small leaves would perform better on average than large contiguous objects, where random access is only marginally better but memory allocation is much worse. > > - traversal is near linear in most case because we can keep > contextual > > info in iterator structures and directly access the leaves that > > contain actual data > > Can you explain this for me? Have a look at the iterator code. In short, iterators are stack-allocated structures that store the current iteration context, that is, the leaf node and the local index where iteration takes place plus info about nodes at upper levels. If you iterate over a rope from beginning to end, most operations will then be direct access on the leaf data, with restricted or complete O(logn) tree traversal once in a while. Hence the near-linear complexity. > > memory allocation is actually much faster with ropes than with flat > > lists, which more than compensates for the overhead. > > The primary overhead is additional burden placed on the programmer to > keep everything within size limits. Indeed. There are two places where extension programmers must take care of size: - small objects *should* fit within a single cell as much as possible. While not a strict requirement, it reduces overhead greatly pretty much everywhere as it keeps the heap very compact and allocation is O(1). - large objects *must* be split into smaller ones; alternatively, one can still use wrapped malloc'd blocks, or even memory mapping (a strength rather than a weakness). In real world apps, very few object types would hit the size limit. We will have to provide efficient core types so that extension programmers don't have to build their own. The same way it's done with Tcl lists being used as a generic container. I believe that only very specialized types would be impacted by this limit, and it's also likely that those would already rely on underlying libraries that provide their own structures (matrix calculus for example) in such a way that our ropes/objects only form a wrapping layer around externally managed storage. > The traditional approach is to abstract stuff like this away so that > the programmer doesn't have to worry about it. But the costs don't go > away; they just get hidden and forgotten. Some of the most > interesting designs I've seen arose from reëvaluating the historical > layers of abstraction. Simple example: simplify the CPU by requiring > that things be aligned, use the extra die area for registers or cache > or execution units, and improve the compiler to statically deal with > alignment. Basically relocate complexity from a place where it is > expensive (in the CPU) to a place where it is cheap (in the compiler). My design is based on the observation that modern architectures are neither limited in computation performances nor in memory size, however the remaining limits are memory throughput and cache size. Hence architectures must make optimal use of cache lines and pagination: memory compacity, locality of reference... My design follows this idea by using a cell-based allocator with very limited overhead (2 bits per cell, i.e. one 16-byte cell per 64-cell pages) compared to a traditional allocator (which uses linked lists of allocated and free space), as well as an allocation policy that favors proximity between related cells. Moreover the mark-and-sweep GC beats simple malloc/free in most cases, even without refcounting (which would add even more space and time overhead), and the generational promotion limits the cache misses even further as well as the impact of GC on stable objects, adding an implicit compaction phase in the process. My preliminary benchmarks make me feel confident about the raw performances I'll get with my implementation compared with traditional architectures (including the current Tcl core). > > Regarding Tcl, this means that the string rep of a container can > > share a large part of its elements' and take up much less memory > > than flat strings (besides being faster to allocate and build). > > Will there be heuristics to decide when to join lots of tiny strings > into a new array? If the program concatenates thousands of > single-character strings, will this be efficient with your current > rope code? The current code builds flat strings if the whole data fits into no more than 3 cells. This is a totally arbitrary value that I chose as a compromise between memory consumption and data sharing/duplication. > set a (0 1 2) > set b ({*}$a 3 4 5) > set c {{*}$a 3 4 5} > puts [lindex $b 0] > puts [lindex $c 0] > > What will this print? Pay careful attention to the use of parentheses > versus braces. % puts [lindex $b 0] 0 % puts [lindex $c 0] $a b is expanded at evaluation time and forms a 6-element list. c is a 4-element list because "$a" is not a list, i.e. it's a single-element list. Compare with: % set d {{*}{0 1 2} 3 4 5} % puts [lindex $d 0] 0 > > I don't think multiple levels of expansion is useful > > (not sure what you mean by that). > > set a (((0 1) (2 3)) ((4 5) (6 7))) > set b {*}$a > set c {*}{*}$a > > would set a to ((0 1) (2 3) (4 5) (6 7)) and b to (0 1 2 3 4 5 6 7). > > But I don't think this is worth having. The Tcl approach is nice and > simple. Just recognize it once. OTOH multiple levels is just a side effect of allowing expansion within the string rep, not a feature in itself. Moreover, lists with expanded elements are only likely to occur when generating the string rep of a list having expanded references, because other occurrences would be expanded inline. Using the expansion modifier in braced expressions doesn't make much sense in practice unless the programmer is a sado-masochist with a taste for obfuscation. % set a (0 1) 0 1 % set b ({*}$a 2 3) 0 1 2 3 % set c ({*}$&a 2 3) {*}{ref .someid}{0 1} 2 3 % lindex $c 0 0 % lindex $c 2 2 % lappend a foobar % lindex $c 2 foobar > proc &whatever (a? (b default_value) c d* e?) { > if {[info exists &a]} { > puts a=$a > } > puts b=$b > puts c=$c > puts d=$d > if {[info exists &e]} { > puts e=$e > } > } > whatever c > whatever a c > whatever a b c > whatever a b c e > whatever a b c d1 e > whatever a b c d1 d2 e Makes sense. Arguments are expanded in the natural order. > set &a 1 > set &mydict (alpha &a bravo b charlie &c) > puts $mydict -> error: reference to nonexistent variable "c" Late bound weak references. > puts $mydict(alpha) -> 1 > puts $mydict(bravo) -> b > puts $mydict(charlie) -> error: reference to nonexistent variable "c" Ditto. > Clearly, it's problematic to store references to nonexistent > variables. Especially bad is the interactive case, since the shell > will try to print the string representation of the return value of > every command, but this is an error for the second [set]. The user > will think that the error was because [set] failed, but in fact [set] > succeeded but the shell failed. I'm tempted to call the whole thing > off, but references to nonexistent variables are essential, or else it > would be impossible to ever create a variable in the first place! So > I think that shells and interactive debuggers need an alternative to > taking the string representation, something that breaks down the > structure a little bit more. I'm thinking of how Tkcon has "hot" > errors that can be clicked to get the backtrace. If the above was > entered into a program like Tkcon, the displayed return value for the > second line would be "alpha 1 bravo b charlie _&c_", where _&c_ is > underlined, in red, and can be clicked for more information. Tricky > stuff! I've had a similar idea to deal with very long strings. One of my goals is to handle very large datasets transparently and painlessly. For example, imagine building a list containing one element that is a large memory-mapped file. This would take ages to display, especially in an interactive session. Non-interactive apps don't have this problem because they seldom need the string rep of their objects (unless they induce shimmering, which is avoidable). So the idea is to display an abbreviated string with a button to expand the elided part. For example: % set v [string repeat "0123456789" 10000] 012345678901234567890123456789<... 99940 chars skipped>012345678901234567890123456789 set l (foo $v) % foo 01234567890123456789012345<... 99944 chars skipped>012345678901234567890123456789 The idea could be extended not only to strings but to containers as well (lists). So you could apply this concept to errors caused by stale references. (Note that Cloverfield avoids this class of problems by design since references always have a valid string rep and a known value in case they are unresolved, although this may change if we allow the string rep generation to fail, as is possible with file backed memory mapping.) > Now for pointer trickery. Thankfully, pointers always have string > representations, even when they point to variables that don't exist > yet. ;^) So I get that your @ suffix is syntactic sugar for an explicit pointer dereference. > Now let me try out the "anonymous references" feature. Which is > probably not a good name for it. > > proc &whatever () {return (5 4 3 2 1 0)} > set &data [whatever] > puts $data{0} -> 5 > puts $&[whatever]{0} -> 5 > puts $&(5 4 3 2 1 0){0} -> 5 > > Look ma, no [lindex]! Interesting. This allows the variable indexing syntax without an actual variables. Kind of C++ anonymous objects. > > > > - Numbers can be expressed in sexagesimal notation: 12'34'56.78 . > > > > Interesting. Consider also Eiffel-style grouping using underscore, > > e.g. 1_000_000 is one million. This improves readability a lot. > > Are the underscores simply ignored, or are there rules about where > they can be placed? Eiffel requires 3-digit groups (i.e. thousands) but a more liberal rule would simply discard them. After all digit grouping is a very cultural matter: Westerners use 3 (a legacy of the Roman empire?), the Chinese and Japanese sometimes use 4-digit grouping, the Indian system uses mixed 2 and 3. > > > - An object can have multiple representations, not just two. > > > > I plan the same feature for Cloverfield. I'd suggest using simple > > arrays or linked lists instead of hash tables, since the number of > > internal reps would be typically small. A hash table would be > overkill > > for that IMHO. > > Yeah, a short array would probably be a better choice. There will > only ever be a handful of valid representations at a time, typically > just one or two. Would it be worthwhile to keep the string separate > from the array of internal representations, or can it be lumped with > the rest as "just another representation"? I think I favor the > latter. What is the benefit of special-casing the string > representation? I'm willing to take the opposite approach: while Tcl defines objects that have a string rep, I'm working on a system where all values ARE strings in the sense that they respond to the string interface, their actual representation being arbitrary (and possibly multiple or layered). This means that you can merge (in the string sense) two "objects" without caring about their actual string rep. In short, objects are ropes. This is one of the big advantages of doing away with the flat strings: in the grand OO tradition, ropes define an interface, not a structure. > Thanks for taking the time to read my super-long emails. Hehe! Two brains are better than one. Cheers, Fred. |
From: Andy G. <unu...@ai...> - 2009-01-19 16:02:03
|
"Frédéric Bonnet" <fre...@fr...> wrote: > So you're doing away with insert and replace it with set on a > pseudo-index. Interesting, I like the idea. It also unifies > insert/replace/append depending on the index and whether the sublist > is empty or not. [list insert] and [list replace] will still exist for use in functional contexts. For example, they can be used to modify the result of a command substitution before passing it to another command as an argument, without using an intermediate variable. [list index] will likewise exist, the same as single-argument [set], even though both are (hopefully!) the same as certain substitution mechanisms built into the language. But sometimes you simply must have a command name. But yes, my idea is to simplify the common imperative case. "Construct a list by doing X to the value of FOO and store that list into FOO." That's what Tcl has (except for [lset] and [lappend]). "Do X to FOO." That's what I'm shooting for. I do not want to have [lset] and [lappend]. I don't want [dict append], [dict incr], [dict lappend], [dict set], [dict unset], or [dict update] either. [dict with] I will have to think about. If imperative is the opposite of functional, does that mean it's dysfunctional? Tcl supports both. Does that mean it's schizophrenic? ;^) > BTW I suppose that for symmetry unset removes elements. Yup! I want to minimize the number of commands that have side effects. It's like the Sluggy Freelance Harry Potter parody. The question came up: Why is everyone in House Wunnybun (Slytherin) evil? The answer: We like to keep all our bad guys in one place; it makes the paperwork simple. So I'd like to keep all of our "bad guys" (commands that change the state of the interpreter) in one place: [set], [unset]. Also I'd like for as many other imperative commands as possible to be (conceptually) implemented in terms of [set] and [unset]; for example, the way I have lambdas and command invocation defined, [proc &name arglist body] is like [set &name (arglist body)]. Can we make [set &var ${var}suffix] as fast as [append &var suffix]? > However are references to sublists "live"? I think I should put off the sublist (a.k.a. slice) reference question until I can nail down the single-element case! Here, try this: set &var (0 1 2 3) set &ref1 &var{end} set &ref2 &var{3} puts "$ref1 $ref2" -> 3 3 set &var{-1} prepend puts $var -> prepend 0 1 2 3 puts "$ref1 $ref2" -> 2 3 set &var{end+1} append puts $var -> prepend 0 1 2 3 append puts "$ref1 $ref2" -> 2 append &ref1 and &ref2 originally referred to the same element, but after prepending to the list, they wound up referring to different elements. Why? Because &var{end} and &var{3} only refer to the same element when &var is a four-element list. I think this might be an example of what you call "live". I'm pretty sure this quandary is due to my reference behavior being incorrect. Perhaps at time of reference construction, the reference should be bound to a particular element, without regard for the possibility of the "path" changing over time. Example: set &var (0 1) set &index 0 set &ref &var{$index} puts $ref -> 0 set &index 1 puts $ref -> 0 Here, &ref continues to refer to 0 even after changing $index. Using the same semantics, the earlier code and printouts change to: set &var (0 1 2 3) set &ref1 &var{end} set &ref2 &var{3} puts "$ref1 $ref2" -> 3 3 set &var{-1} prepend puts $var -> prepend 0 1 2 3 puts "$ref1 $ref2" -> 3 3 set &var{end+1} append puts $var -> prepend 0 1 2 3 append puts "$ref1 $ref2" -> 3 3 Now the question is: what happens on [unset]? I think that unsetting one of the references should just delete the reference. Unsetting &var or &var{4} will cause the references to lose their string representation. After that, what if these references are used to set a value again? If &var was unset, I imagine this will cause &var to be created and set to a single-element list. If &var{4} was unset, this may cause a new element to be inserted between 2 and append. Another option which may be preferable is to forbid using a reference to a deleted element as the argument to [set]. A third option is to allow it but make no attempt to modify &var; perhaps they lose their referencehood and just become ordinary variables; or perhaps &ref1 and &ref2 continue to be linked, but they are no longer linked to &var. Tricky stuff!! I hope this isn't premature, but let me take a stab at sublist references. If a single-element reference is bound to a particular element at the time the reference is constructed, then a sublist reference should be bound to a pair of "anchor" elements, which serve as indexes. set &var (0 1 2 3) set &ref &var{1:2} puts $ref -> 1 2 set &var{2:} 1.5 puts $var -> 0 1 1.5 2 3 puts $ref -> 1 1.5 2 On [unset] of an anchor element, the sublist reference could become a reference to a deleted object, as described above. I don't think I like this. It could also remain valid but have its anchor element updated, like so: set &var (0 1 2 3) set &ref &var{1:2} unset &var{1} puts $ref -> 0 2 unset &var{2} puts $ref -> 0 3 unset &var{0} puts $ref -> 3 unset &var{0} puts $ref -> {} set &var (a b c d e f g h i j k l m n o p q r s t u v w x y z) puts $ref -> a b c d e f g h i j k l m n o p q r s t u v w x y z The last line is explained by the anchor elements having become the start and the end of the list, so effectively &ref became a reference to &var, not to any particular sublist. I don't think I like this either, since it's otherwise not possible to create a reference to a sublist anchored at end, where the meaning of "end" updates when elements are appended to the list. I sincerely hope it doesn't become idiomatic to add an extra element to a list, create a reference, then delete the extra element. So there's more problems. Everything I said above also applies to pointers. Your thoughts? > On my side I've worked a bit on the whole concept of vector indexing, > and I'm replacing it by the more general concept of "selector" of > which flat indexing is the simplest case. More on that later (I'm > working on a revision of the syntax). I'd be interested to see if this simplifies any of this stuff I've been thinking about. > I don't want to provide features that are too difficult to use (or > implement) from C as I want to keep the dual nature of Tcl. The dual nature of Tcl? Do you mean how it is useful both as a scripting language and a C library? Either can be used without the other, and they can be used together within a single project. > So with the concept of selector I can address the various access > semantics that usual container classes can provide (the C++ STL is > a good start) Much of the STL is made unnecessary by Tcl's built-in strings, lists, and dictionaries. So it is not necessary to emulate all of it. Sure, use it as a source for ideas, but don't set any semblance of compatibility with it as a goal. Take iterators, for example. I suppose they could be similar to pointers, except that they are constructed by means of a command and that they are callable and have subcommands. Like pointers, they can be "followed" with a trailing @. Unlike pointers, they can be updated to point to the "next" element (or whatever), using a subcommand. By callable, I mean that one can type the name of the variable containing the pointer as the first word of the command line. But also consider that C++ iterators over linked lists are important for performance but harder to use than numeric indexes into arrays. Tcl doesn't have this performance problem (*cough* *cough* what I really mean is that it's not something that can or should be dealt with at the script level), so what's the point of iterators over linked lists? C++ iterators can be stored and passed around, but so can my pointers and references. C++ iterators can be compared, incremented, and decremented. Is that worth it? Instead just pass the whole damn list! Or a sublist, or reference thereto. Want reverse iterators? Use -1 stride. > allowing custom object types to introduce their own indexing methods > (e.g XPath for XML elements). Still, the two types of indexing will be vectored and keyed. You're saying that custom objects can provide their own implementations for these. I'd like to see this power exposed at the script level. ;^) -- Andy Goth | http://andy.junkdrome.org/ unununium@{aircanopy.net,openverse.com} |
From: Frédéric B. <fre...@fr...> - 2009-01-19 10:11:09
|
Hi, ----- "Andy Goth" <unu...@ai...> a écrit : > Today I spent some time contemplating vector index references and how > they can be used. [...] > 2. {index:} > - refers to the empty list "between" index and its predecessor > - if end, refers to the space before the last element > - if zero or negative, refers to the space before the first > element > - if equal to or greater than the list length, refers to the space > after the last element [...] > Examples of {index:}: > > set &var{0:} (W X) -> W X Y Z A b c D E F > set &var(-9:} (V) -> V W X Y Z A b c D E F > set &var{1:} (0 1) -> V 0 1 W X Y Z A b c D E F > set &var{end:} (2 3) -> V 0 1 W X Y Z A b c D E 2 3 F > set &var{end+1:} (G H) -> V 0 1 W X Y Z A b c D E 2 3 F G H > set &var{end+9:} (I) -> V 0 1 W X Y Z A b c D E 2 3 F G H I So you're doing away with insert and replace it with set on a pseudo-index. Interesting, I like the idea. It also unifies insert/replace/append depending on the index and whether the sublist is empty or not. BTW I suppose that for symmetry unset removes elements. However are references to sublists "live"? This is a really big can of worms you're opening there at it implies a lot of bookkeeping. On my side I've worked a bit on the whole concept of vector indexing, and I'm replacing it by the more general concept of "selector" of which flat indexing is the simplest case. More on that later (I'm working on a revision of the syntax). This is also subject to evolve with my progress on the implementation, because I want to keep the syntax and the implementation on par: I don't want to provide features that are too difficult to use (or implement) from C as I want to keep the dual nature of Tcl. So with the concept of selector I can address the various access semantics that usual container classes can provide (the C++ STL is a good start) from both the script and C levels, allowing custom object types to introduce their own indexing methods (e.g XPath for XML elements). So long, Fred |
From: Andy G. <unu...@ai...> - 2009-01-17 08:08:13
|
Today I spent some time contemplating vector index references and how they can be used. In my language (to date), there are really two kinds of vector index references: references to elements and references to sublists ("slices", in Python parlance). An individual index has one of the following forms: 1. expression ([expr]-style, must evaluate to an integer) 2. end 3. end-expression 4. end+expression Examples: "5", "$a", "$a*$b", "end", "end-$a*$b", "end - $a * $b", "end-1" Case (3) above is equivalent to case (4) with a "-" prepended to the expression, so "end-$a+$b" means "end+(-$a)+$b" and not "end-($a+$b)". In case (2), (3), and (4), "end" is numerically equal to the list length minus one. A vector index has one of the following forms, where index and index2 are defined according to the above table: 1. {index} - refers to the single element at index - if end, refers to the last element - if negative, refers to the nonexistent element before the first element - if equal to or greater than the list length, refers to the nonexistent element after the last element 2. {index:} - refers to the empty list "between" index and its predecessor - if end, refers to the space before the last element - if zero or negative, refers to the space before the first element - if equal to or greater than the list length, refers to the space after the last element 3. {index:index2} - refers to the list of all elements starting at index and ending at index2, inclusive - equivalent to form (4) with +1 stride - if index2 is less than index, then equivalent to form (2) 4. {index:index2:stride} - refers to the list of all elements starting at index and ending at index2 - only every stride'th element is included - the index element is always included - the index2 element may not be included, depending on stride - stride is an [expr]-style expression - stride must be a positive or negative integer, but not zero - if index is equal to index2, refers to the single-element list containing index, regardless of stride - if the sign of stride is not equal to the sign of index2-index, equivalent to form (2) Examples of {index}: set &var (a b c) -> a b c d set &var{0} A -> A b c d set &var{end} D -> A b c D set &var{end+1} -> error: doesn't exist set &var{end+1} E -> A b c D E set &var{end+9} F -> A b c D E F set &var{-1} -> error: doesn't exist set &var{-1} Z -> Z A b c D E F set &var{-9} Y -> Y Z A b c D E F Examples of {index:}: set &var{0:} (W X) -> W X Y Z A b c D E F set &var(-9:} (V) -> V W X Y Z A b c D E F set &var{1:} (0 1) -> V 0 1 W X Y Z A b c D E F set &var{end:} (2 3) -> V 0 1 W X Y Z A b c D E 2 3 F set &var{end+1:} (G H) -> V 0 1 W X Y Z A b c D E 2 3 F G H set &var{end+9:} (I) -> V 0 1 W X Y Z A b c D E 2 3 F G H I It's well past my bedtime... I will write more tomorrow! -- Andy Goth | http://andy.junkdrome.org/ unununium@{aircanopy.net,openverse.com} |
From: Andy G. <unu...@ai...> - 2009-01-15 17:08:48
|
"Frédéric Bonnet" <fre...@fr...> wrote: > Still alive ;-) "I'm doing science and I'm still alive!" > What you describe sounds like skiplists. Yeah, I think you're right. > OTOH my implementation is "container" based (if I understand what > you mean by container), I was thinking of a "classical" tree implementation wherein the tree nodes are malloc'ed on the heap, contain pointers to each other, and have pointers to individually malloc'ed string data. I was contrasting that with the string pointers pointing into flat data already allocated elsewhere in large chunks, so that the number of string allocations is (or can be) much smaller than the number of leaf nodes. > I get very good performances thanks to the allocator and GC. With the > stock allocator I think I'll get the locality problems that you > mention. I will definitely have to check out your code. Perhaps tonight after work. Or maybe if I get sufficiently bored, I won't wait until I get home. :^) That's not likely to happen; I have to port a huge application from Qt3 to Qt4, and it uses Qt Designer and custom widgets. By "that's not likely to happen", I of course mean that it's not likely I will ever get home. ;^) > There would be no difference between a rope-based string and a > btree-based list, aside from memory management: as lists contain > objects, they must be traversed as well by a mark-and-sweep GC. I suspect performance would suffer a little bit if the garbage collector had to separately track each character. ;^) I'm wondering if the GC can be extended to treat sequential blocks of objects as a single entity, so that it can handle both a character string and a single large object without special casing. This might not worthwhile if character strings are the only place where the number of objects in the block is greater than one. Try to think of cases where it will be useful for the GC to treat an array of something other than characters as a single logical unit. > Regarding performances, some benchmarking I've run shows a real bonus > when handling large objects: Interesting. How will the block size limit impact the extension programmer when dealing with large objects? I guess it will be possible to subdivide large objects into smaller objects with some kind of master object that points to them. And arrays (which I find are the main reason why objects get to be large) can be handled by this new code you're about to write. > - traversal is near linear in most case because we can keep contextual > info in iterator structures and directly access the leaves that > contain actual data Can you explain this for me? > Since memory block size is bounded, heap fragmentation is less > critical, as large data can be spread over several disjoint blocks. Awesome. ;^) I remember dealing with a certain much-hated image generator that sometimes would SIGBUS on startup when run on IRIX. Eventually I figured out that this was due to a debug option that caused it to malloc a single block of memory many hundreds of megabytes in size. Sometimes IRIX couldn't find a place in memory for it to go, so the malloc failed, but the image generator assumed success. To fix, I had to reduce the size of the array, but this also reduced the period of time when the system would be able to record debug data. Since the developers apparently had never heard of ring buffers, after this time expired, it just stopped capturing data, so I had to work fast. And also it would collect data during all the time it spent "sitting on the runway" while waiting on the host system to come up to speed, the operator to create an exercise, and the pilot to preflight the jet, ignite the engines, and take off. For some reason, having the image generator guys fix their stupid code was out of the question, even though they should have been paying me for (black-box!) debugging and profiling their system for them. Shrug! > memory allocation is actually much faster with ropes than with flat > lists, which more than compensates for the overhead. The primary overhead is additional burden placed on the programmer to keep everything within size limits. > So in practice flat allocations delegate the algorithmic complexity to > the heap allocator. The traditional approach is to abstract stuff like this away so that the programmer doesn't have to worry about it. But the costs don't go away; they just get hidden and forgotten. Some of the most interesting designs I've seen arose from reëvaluating the historical layers of abstraction. Simple example: simplify the CPU by requiring that things be aligned, use the extra die area for registers or cache or execution units, and improve the compiler to statically deal with alignment. Basically relocate complexity from a place where it is expensive (in the CPU) to a place where it is cheap (in the compiler). > Moreover a tree-based rope means that several ropes can share the same > data. I'm counting on this. ;^) One rope will contain the entire script, another will contain a procedure body, a third will contain a list, a fourth will contain a string. The procedure is a subset of the script, the list is a literal entity in the procedure, and the string is an element in the list. Yet across all these ropes there is only one copy of the string I mentioned. Even after being concatenated with something else, there is only one copy of it, even if that concatenation is used as a second procedure body in programmatically generated code. > Regarding Tcl, this means that the string rep of a container can > share a large part of its elements' and take up much less memory > than flat strings (besides being faster to allocate and build). Will there be heuristics to decide when to join lots of tiny strings into a new array? If the program concatenates thousands of single-character strings, will this be efficient with your current rope code? > "Andy Goth" <unu...@ai...> a écrit : > > Argument expansion has proven to be very valuable to Tcl, so of > > course it stays. But I don't extend it to allow multiple levels of > > expansion. > > In Cloverfield it is only extended to be an accepted markup syntax in > list string reps. set a (0 1 2) set b ({*}$a 3 4 5) set c {{*}$a 3 4 5} puts [lindex $b 0] puts [lindex $c 0] What will this print? Pay careful attention to the use of parentheses versus braces. > I don't think multiple levels of expansion is useful > (not sure what you mean by that). set a (((0 1) (2 3)) ((4 5) (6 7))) set b {*}$a set c {*}{*}$a would set a to ((0 1) (2 3) (4 5) (6 7)) and b to (0 1 2 3 4 5 6 7). But I don't think this is worth having. The Tcl approach is nice and simple. Just recognize it once. > #{ }# should nest so that the following works as expected: > > #{ if {$cond1} { }# > if {$cond2} { > ... > } > > This means that sole braces don't match comment openings, inline > comments are delimited by complete digrams. That's a better idea; I hadn't considered the need to put unbalanced braces inside inline comments. Which is silly of me, since I know we support unbalanced braces inside whole-line comments, even when those comments are inside brace-quoted script bodies. While on the subject of comments, there's something I forgot to mention in my previous email. Like in Cloverfield, I want to allow ordinary whole-line comments to begin anywhere a word can. In Tcl they can only begin where a command can. This is confusing to many people. Although, to be perfectly honest, restricting them to only begin at word starts is confusing too, but only because this is contrary to many other popular languages that allow them everywhere except inside quotes. But I think this restriction is perfectly valid, because quoting is also only allowed to begin at word starts, and I like the symmetry. (Shell-style languages allow both comments and quotes to begin anywhere, so they're symmetric as well.) > > One kind, which somewhat corresponds to C pointers, requires > > explicit dereferencing. The string representation of such a > > reference is "object 123" or similar. The other kind, which > > somewhat corresponds to C++ references, is dereferenced > > automatically. The string representation is the instantaneous > > value of the referent variable. > > Your references seem close to Cloverfield's. Your pointers seem to be > opaque handles like Jim's references. Yes, the pointers are opaque handles. The language directly supports "dereferencing" them, by mixing @ into a $ substitution. But I have to be careful with my terms here, since "dereferencing" a pointer will yield a (transparent, implicit) reference to the variable, which "decays" to the instantaneous value of the pointed-to variable. (A pointer is an opaque, explicit reference.) So is "dereferencing" the right word to use for the @? I probably should find a different word, like "traverse" or "follow". You "follow" a pointer to obtain a (transparent) reference to the variable it points to. > I don't get all of it but it sounds interesting. Example code > welcome! I wrote this last night, but didn't save it. I'll type it from memory. It demonstrates references (not pointers) and the funky expanded proc notation. proc &whatever (a? (b default_value) c d* e?) { if {[info exists &a]} { puts a=$a } puts b=$b puts c=$c puts d=$d if {[info exists &e]} { puts e=$e } } whatever c whatever a c whatever a b c whatever a b c e whatever a b c d1 e whatever a b c d1 d2 e All the object names are written either as a reference or a substitution, so that the interpreter knows that they are names. I believe this is important for indexing. Note: proc's second argument does not do this, because otherwise proc would receive a list of references to global variables. I hope this seeming inconsistency is not too confusing. :^( The proc dispatch code first binds all required arguments to variables. Next it binds optional arguments, from left to right. Last it binds variadic arguments. This approach is backward-compatible with the traditional Tcl/C++/Python style of requiring required arguments, optional arguments, and variadic arguments to appear in that order in the proc or function definition, but it maps more closely with code that takes a required argument at the end and supports options at the beginning. Here's more. set &a 1 set &mydict (alpha &a bravo b charlie &c) puts $mydict -> error: reference to nonexistent variable "c" puts $mydict(alpha) -> 1 puts $mydict(bravo) -> b puts $mydict(charlie) -> error: reference to nonexistent variable "c" puts $c -> error: no such variable "c" set &c 0 puts $mydict -> alpha 1 bravo b charlie 0 set &mydict(alpha) 2 set &mydict(bravo) 3 set &mydict(charlie) 4 puts $mydict(alpha) -> 2 puts $mydict(bravo) -> 3 puts $mydict(charlie) -> 4 puts $a -> 2 puts $c -> 4 Clearly, it's problematic to store references to nonexistent variables. Especially bad is the interactive case, since the shell will try to print the string representation of the return value of every command, but this is an error for the second [set]. The user will think that the error was because [set] failed, but in fact [set] succeeded but the shell failed. I'm tempted to call the whole thing off, but references to nonexistent variables are essential, or else it would be impossible to ever create a variable in the first place! So I think that shells and interactive debuggers need an alternative to taking the string representation, something that breaks down the structure a little bit more. I'm thinking of how Tkcon has "hot" errors that can be clicked to get the backtrace. If the above was entered into a program like Tkcon, the displayed return value for the second line would be "alpha 1 bravo b charlie _&c_", where _&c_ is underlined, in red, and can be clicked for more information. Tricky stuff! I prefer EIAS, but I don't know how to do it in this specific case. There can be an [info] command to get the reference status of a word, returning a tagged list giving the partial string representation and information about the references. I should note that most programs won't have to bother with this, only interactive consoles that take it upon themselves to print all return values, even when the user doesn't ask for them. Now for pointer trickery. Thankfully, pointers always have string representations, even when they point to variables that don't exist yet. ;^) set &mydict(alpha) @mydict(charlie) # (object 1) is now a pointer to &mydict(charlie) puts $mydict -> alpha (object 1) bravo 3 charlie 4 puts $mydict(alpha) -> object 1 puts $mydict(alpha)@ -> 4 puts $mydict(alpha)@@ -> error: invalid pointer "4" set $mydict(alpha)@ @d # (object 2) is now a pointer to &d puts $mydict -> alpha (object 1) bravo 3 charlie (object 2) puts $mydict(alpha) -> object 1 puts $mydict(alpha)@ -> object 2 puts $mydict(alpha)@@ -> error: pointer to nonexistent variable "d" puts $d -> error: no such variable "d" set $mydict(alpha)@@ @a # (object 3) is now a pointer to &a puts $mydict -> alpha (object 1) bravo 3 charlie (object 2) puts $d -> object 3 puts $mydict(alpha) -> object 1 puts $mydict(alpha)@ -> object 2 puts $mydict(alpha)@@ -> object 3 puts $mydict(alpha)@@@ -> object 1 puts $mydict(alpha)@@@@ -> object 2 puts $mydict(alpha)@@@@@ -> object 3 Let's analyze the line "set $mydict(alpha)@@ @a". "$mydict(alpha)" substitutes to "object 1", which is followed (first @) to obtain "object 2", which is followed again (second @) to obtain a reference to d. So set's first argument is &d. "@a" substitutes to "object 3", and as a side effect this pointer is added to the interpreter's pointer lookup table. set creates d ('cuz it doesn't already exist) and stores into it the string "object 3" with internal representation "pointer to a". Remember that $mydict(alpha) is &a, so "object 3" points to the same data as $mydict(alpha). This explains the circular reference. Another: set &mydict (alpha @a bravo @b) set &a (1 2 3 4) set &b (5 6 7 8) puts $mydict -> alpha (object 1) bravo (object 2) puts $mydict(alpha) -> object 1 puts $mydict(alpha)@ -> 1 2 3 4 puts $mydict(alpha)@{0} -> 1 puts $mydict(alpha)@(1) -> 2 See how easily @ can be intermixed with indexing? Also it's possible to create multiple pointers or references to the same variable before actually creating the variable. Pointers to a and b are obtained on line 1, then references to those same variables are obtained on lines 2 and 3. Moreover, see how much more sane this last example is? The only place it ever puts references to nonexistent variables is into the first argument of set, which never tries to take the string representation of its first argument. Now let me try out the "anonymous references" feature. Which is probably not a good name for it. proc &whatever () {return (5 4 3 2 1 0)} set &data [whatever] puts $data{0} -> 5 puts $&[whatever]{0} -> 5 puts $&(5 4 3 2 1 0){0} -> 5 Look ma, no [lindex]! > > - Numbers can be expressed in sexagesimal notation: 12'34'56.78 . > > Interesting. Consider also Eiffel-style grouping using underscore, > e.g. 1_000_000 is one million. This improves readability a lot. Are the underscores simply ignored, or are there rules about where they can be placed? > > - Variables, procs, commands, channels, etc. are all in a single > > object store. Objects can be local to a stack frame. > > Would this allow RAII style programming? Yes, the objects will have suitable destructors which will be executed when the garbage collector reaps the final reference. For example, a channel will be closed. Having the garbage collector work on more than just data objects will be a major boon to functional programming. Something similar to TclOO should be exposed at the script level, except that the user-defined destructor is automatically invoked by the garbage collector when the last reference goes away due to unset or falling off the stack. > > - An object can have multiple representations, not just two. > > I plan the same feature for Cloverfield. I'd suggest using simple > arrays or linked lists instead of hash tables, since the number of > internal reps would be typically small. A hash table would be overkill > for that IMHO. Yeah, a short array would probably be a better choice. There will only ever be a handful of valid representations at a time, typically just one or two. Would it be worthwhile to keep the string separate from the array of internal representations, or can it be lumped with the rest as "just another representation"? I think I favor the latter. What is the benefit of special-casing the string representation? > > - Word origins are tracked and accessed with [info origin]. This > > facilitates syntax error messages. > > This would serve the same purpose of Cloverfield's {metadata}. > Typically a debug-only feature. We will have to investigate the run-time overhead of supporting this. I had the hope that this feature can always be available, since it will make it possible to put filenames and line numbers in backtraces. When code or data comes from something other than a file, things get really interesting. If it's read from a network socket, what are the filename and line number? Perhaps the IP address and timestamp. If it's read from the GUI, now what is the line number? Heh. > > And that's what I have so far... > > Can I steal a couple of ideas? ;-) No, you can't have them! They're MINE!! ;^) Well, maybe just a few. If you promise to share some of yours. :^) I'm a couple hours late for work. Time to go!! Thanks for taking the time to read my super-long emails. -- Andy Goth | http://andy.junkdrome.org/ unununium@{aircanopy.net,openverse.com} |
From: Frédéric B. <fre...@fr...> - 2009-01-15 14:09:53
|
Hi! ----- "Andy Goth" <unu...@ai...> a écrit : > fre...@fr... wrote: > > Hi Andy! > > It is good to hear from you again. I was worried that you had fallen > off the edge of the earth, or (worse), off the edge of the Internet! Still alive ;-) > > I've had some time to work on my rope thing (as you can read from > the > > message I've just posted) > > I did read it (the email, not the sources), and I am quite excited. > I've been thinking about ropes and whether they're really worth it, > and the conclusion I came to is that they can help performance when > they are implemented as indexes over concatenations of strings of > mixed character width, especially if the strings are mostly contiguous > in memory, such as when they were read from a file. But if the ropes > are "merely" the container, then they negatively impact locality of > reference. Your email agrees with my intuition and backs it up with > both implementation and measurements. Very cool! What you describe sounds like skiplists. OTOH my implementation is "container" based (if I understand what you mean by container), but I get very good performances thanks to the allocator and GC. With the stock allocator I think I'll get the locality problems that you mention. > > > I have some idea for the next stage, AKA the Grand Unification > Scheme of > > string and object structures. > > The ropes I had in mind were actually indexes over arrays of arbitrary > objects. Characters are just one popular type of object. ;^) Yep. There would be no difference between a rope-based string and a btree-based list, aside from memory management: as lists contain objects, they must be traversed as well by a mark-and-sweep GC. But regular operations (insertion, extraction, removal) are identical. I'm certainly going to reuse most of the code I've written on ropes to implement my new lists. Regarding performances, some benchmarking I've run shows a real bonus when handling large objects: while flat allocation (arrays) is algorithmically simpler, allocation of large memory areas take up much more CPU over time because the heap becomes fragmented, and as the memory is non movable the heap is not compactable. This requires very precise strategies (object pools etc.) that can become as complex as writing a custom allocator. Whereas my rope scheme limits the size of contiguous memory blocks to a bit less than a logical page (1KB). This implies that larger data needs to be split into smaller chunks, but in practice the impact is minimal because: - most source data is small: scalar objects, command lines and arguments, etc. - larger data is most often composite - for larger data, random access is O(logn) so the impact of the splitting is limited - traversal is near linear in most case because we can keep contextual info in iterator structures and directly access the leaves that contain actual data Since memory block size is bounded, heap fragmentation is less critical, as large data can be spread over several disjoint blocks. So memory allocation is actually much faster with ropes than with flat lists, which more than compensates for the overhead. Besides, while rope data may be fragmented, memory locality is actually very good because higher order nodes (concat, substr) take up one cell each and thus can be allocated in consecutive cells and fill up holes. For large strings, leaf nodes (those containing char data) take up whole pages, and higher order nodes are allocated next to each other. So traversal benefits greatly from the improved locality (much more than randomly allocated malloc blocks), and for data access there is no different compared to flat strings, since the processor's cache lines don't necessary contain consecutive pages even for flat strings. So in practice flat allocations delegate the algorithmic complexity to the heap allocator. Moreover a tree-based rope means that several ropes can share the same data. This implies that the sum of the lengths of all ropes can far exceed the actually allocated memory. Take a look at Rope_Repeat for example: it can create ropes up to the theoretical limit (4GB) by simply concatenating a rope onto itself, thus consuming a very small amount of memory. Regarding Tcl, this means that the string rep of a container can share a large part of its elements' and take up much less memory than flat strings (besides being faster to allocate and build). Pushing the idea to the extreme, one could build a rope containing ALL the strings of a running app without doubling the memory consumption all of a sudden. Very useful for persistence: build a rope that describes the whole state of the app, dump into file, and restore later. This could provide a feature similar to Smalltalk's VM images. > I made substantial changes to the syntax, to the point of it being > unfair to call my work "Cloverfield". I haven't come up with another > name yet. I warn you: this is all very preliminary, and it's surely > not self-consistent. It's just ideas. > > One major difference is that I backed off completely from the word > modifier idea. I consider dropping some modifiers as well. The whole idea is quite controversial as I found out. [...] > With null gone, I next looked at metadata. I decided to throw it out > as well since its uses can be handled instead by data structures. For > example, I often need multiple pieces of code to track internal > "annotations" they place on common data. They most easily do this by > using the data as the index into an internal associative array. But > with the Cloverfield metadata facility, either only one annotation at > a time is allowed, or every module in an application has to agree on a > convention for not stomping on every other module's annotations. The use case I had in mind for metadata was debugging tools. Metadata is not intended for general or library use. The idea would be that debugging tools could tag values with specific metadata, for example creation context, for later retrieval. As this is a very specific use case I consider reserving the metadata modifier to a debug version, and maybe renaming it to {debug} to be more explicit. But it is a very restricted use case so creating a word modifier is debatable. > I dropped delayed substitution since I can't think of a good use for > it. I got inspired from Lisp's delayed execution. It's very useful in itself but I'll drop it as well if there are other ways to achieve the same purpose, or if it's too difficult to implement. Moreover the semantics needs precise definition, and may need full-fledged closure and/or reference support. > Argument expansion has proven to be very valuable to Tcl, so of course > it stays. But I don't extend it to allow multiple levels of > expansion. I have never seen a case where this would help, and it can > be implemented in script. In Cloverfield it is only extended to be an accepted markup syntax in list string reps. I don't think multiple levels of expansion is useful (not sure what you mean by that). > I really want word comments, but I don't want to think of them as > "word"-anything. Instead I use the more familiar term "inline > comments", and the syntax is #{...}# , where the opening #{ starts > wherever a word can. Braces are matched (modulo embedded quoting), > and it is an error if a # does not immediately follow the final > closing brace. (Note: Vim's syntax highlighter cannot handle nested > comments, so I might have to submit a patch.) My revision will include this syntax as well (dropping the ill-conceived word comments in the process), following a past discussions on this list during which we came to the conclusion that #{ }# should nest so that the following works as expected: #{ if {$cond1} { }# if {$cond2} { ... } This means that sole braces don't match comment openings, inline comments are delimited by complete digrams. > I have a very different idea for references, to be explained below. > But it is not implemented in terms of word modifier notation. > [...] > > On to references! At the moment (I can change my mind in an > instant!), my design actually has two kinds of references. The > difference is in how they are dereferenced. One kind, which somewhat > corresponds to C pointers, requires explicit dereferencing. The > string representation of such a reference is "object 123" or similar. > The other kind, which somewhat corresponds to C++ references, is > dereferenced automatically. The string representation is the > instantaneous value of the referent variable. For now, I name these > "pointers" and "references", respectively. Your references seem close to Cloverfield's. Your pointers seem to be opaque handles like Jim's references. [... more details about pointers and references ...] I don't get all of it but it sounds interesting. Example code welcome! > Miscellaneous: > > - Numbers can be expressed in sexagesimal notation: 12'34'56.78 . ' > is used instead of : to avoid conflict with ?: . I do a lot of GIS > stuff at work, so this would be directly useful to me. It can be used > for time as well as geography. Interesting. Consider also Eiffel-style grouping using underscore, e.g. 1_000_000 is one million. This improves readability a lot. > - Variables, procs, commands, channels, etc. are all in a single > object store, with string pseudo-values "object 123", "command 123", > "channel 123", etc., kind of like in Python. Variables and procs have > proper string representations. [unset] is used to destroy any > reference to an object, and the object is garbage-collected when no > references (e.g. variables) remain. Objects can be local to a stack > frame. Would this allow RAII style programming? > - An object can have multiple representations, not just two. > Non-string conversions may avoid an intermediate string form. > Optimized for common dual-ported case. Hash table used for tracking > multiple representations, and stale representations are removed from > hash table. Hash table created when multiple non-string > representations are valid. Example: taking the list representation of > a dict. I plan the same feature for Cloverfield. I'd suggest using simple arrays or linked lists instead of hash tables, since the number of internal reps would be typically small. A hash table would be overkill for that IMHO. > - Word origins are tracked and accessed with [info origin]. This > facilitates syntax error messages. This would serve the same purpose of Cloverfield's {metadata}. Typically a debug-only feature. > - Variadic and default arguments to proc can appear anywhere in the > argument list, and the "args" argument can have any name; just append > * to denote its catchall status, like in Python. Append ? to the name > to make its default "value" to leave the variable unset. Use a > two-element list to supply a real default value, like in Tcl. Interesting. I like the ? syntax. > And that's what I have so far... Can I steal a couple of ideas? ;-) Cheers, Fred |