From: <ibr...@us...> - 2011-05-14 11:12:11
Revision: 3962
          http://tora.svn.sourceforge.net/tora/?rev=3962&view=rev
Author:   ibre5041
Date:     2011-05-14 11:12:04 +0000 (Sat, 14 May 2011)

Log Message:
-----------
newer libantlr version

Added Paths:
-----------
    branches/tora-trotl/src/libantlr3c-3.3/
    branches/tora-trotl/src/libantlr3c-3.3/AUTHORS
    branches/tora-trotl/src/libantlr3c-3.3/CMakeLists.txt
    branches/tora-trotl/src/libantlr3c-3.3/COPYING
    branches/tora-trotl/src/libantlr3c-3.3/ChangeLog
    branches/tora-trotl/src/libantlr3c-3.3/Makefile.am
    branches/tora-trotl/src/libantlr3c-3.3/NEWS
    branches/tora-trotl/src/libantlr3c-3.3/README
    branches/tora-trotl/src/libantlr3c-3.3/antlr3config.h.in
    branches/tora-trotl/src/libantlr3c-3.3/antlr3config.h.in.cmake
    branches/tora-trotl/src/libantlr3c-3.3/include/
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3baserecognizer.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3basetree.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3basetreeadaptor.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3bitset.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3collections.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3commontoken.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3commontree.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3commontreeadaptor.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3commontreenodestream.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3convertutf.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3cyclicdfa.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3debugeventlistener.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3defs.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3encodings.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3errors.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3exception.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3filestream.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3input.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3interfaces.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3intstream.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3lexer.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3memory.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3parser.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3parsetree.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3recognizersharedstate.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3rewritestreams.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3string.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3tokenstream.h
    branches/tora-trotl/src/libantlr3c-3.3/include/antlr3treeparser.h
    branches/tora-trotl/src/libantlr3c-3.3/src/
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3baserecognizer.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3basetree.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3basetreeadaptor.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3bitset.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3collections.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3commontoken.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3commontree.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3commontreeadaptor.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3commontreenodestream.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3convertutf.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3cyclicdfa.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3debughandlers.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3encodings.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3exception.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3filestream.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3inputstream.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3intstream.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3lexer.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3parser.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3rewritestreams.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3string.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3tokenstream.c
    branches/tora-trotl/src/libantlr3c-3.3/src/antlr3treeparser.c

Removed Paths:
-------------
    branches/tora-trotl/src/libantlr3c-3.2/

Added: branches/tora-trotl/src/libantlr3c-3.3/AUTHORS
===================================================================
--- branches/tora-trotl/src/libantlr3c-3.3/AUTHORS (rev 0)
+++ branches/tora-trotl/src/libantlr3c-3.3/AUTHORS 2011-05-14 11:12:04 UTC (rev 3962)
@@ -0,0 +1,41 @@
+The ANTLR recognizer generator tool was written by (with many contributions)
+
+ Prof. Terence Parr at USF
+
+The C runtime and related C code generation elements were written by
+
+ Jim Idle (jimi idle ws s/ /./g) with no contributions at all because
+ nobody else was crazy enough to do it.
+
+The C runtime and the ANTLR tool itself are distributed under the BSD license
+which basically gives the right to do anything with it, so long as you
+recognize the authors. See here:
+
+ [The "BSD licence"]
+ Copyright (c) 2005-2009 Jim Idle, Temporal Wave LLC
+ http://www.temporal-wave.com
+ http://www.linkedin.com/in/jimidle
+
+ All rights reserved.
+
+ Redistribution and use in source and binary forms, with or without
+ modification, are permitted provided that the following conditions
+ are met:
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+ 3. The name of the author may not be used to endorse or promote products
+ derived from this software without specific prior written permission.
+
+ THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+ IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+ THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Added: branches/tora-trotl/src/libantlr3c-3.3/CMakeLists.txt
===================================================================
--- branches/tora-trotl/src/libantlr3c-3.3/CMakeLists.txt (rev 0)
+++ branches/tora-trotl/src/libantlr3c-3.3/CMakeLists.txt 2011-05-14 11:12:04 UTC (rev 3962)
@@ -0,0 +1,116 @@
+
+project(libantlr3c)
+OPTION(ANTLR3_NODEBUGGER "ANTLR debugger not required" OFF)
+
+SET (VERSION "3.2")
+SET (PACKAGE_BUGREPORT "ji...@te...")
+SET (PACKAGE ${CMAKE_PROJECT_NAME})
+
+SET (PACKAGE_VERSION ${VERSION})
+SET (PACKAGE_NAME ${CMAKE_PROJECT_NAME})
+SET (PACKAGE_TARNAME ${CMAKE_PROJECT_NAME})
+SET (PACKAGE_STRING "${CMAKE_PROJECT_NAME} ${VERSION}")
+
+cmake_minimum_required(VERSION 2.6)
+
+INCLUDE (CheckIncludeFiles)
+INCLUDE (CheckFunctionExists)
+
+CHECK_INCLUDE_FILES ( "arpa_nameser.h" HAVE_ARPA_NAMESER_H )
+CHECK_INCLUDE_FILES ( "ctype.h" HAVE_CTYPE_H )
+CHECK_INCLUDE_FILES ( "dlfcn.h" HAVE_DLFCN_H )
+CHECK_INCLUDE_FILES ( "inttypes.h" HAVE_INTTYPES_H )
+CHECK_INCLUDE_FILES ( "malloc.h" HAVE_MALLOC_H )
+CHECK_INCLUDE_FILES ( "sys/malloc.h" HAVE_SYS_MALLOC_H )
+CHECK_INCLUDE_FILES ( "memory.h" HAVE_MEMORY_H )
+CHECK_INCLUDE_FILES ( "netdb.h" HAVE_NETDB_H )
+CHECK_INCLUDE_FILES ( "netinet/in.h" HAVE_NETINET_IN_H )
+CHECK_INCLUDE_FILES ( "netinet/tcp.h" HAVE_NETINET_TCP_H )
+CHECK_INCLUDE_FILES ( "resolv.h" HAVE_RESOLV_H )
+CHECK_INCLUDE_FILES ( "sys/resolv.h" HAVE_RESOLV_H )
+CHECK_INCLUDE_FILES ( "socket.h" HAVE_SOCKET_H )
+CHECK_INCLUDE_FILES ( "sys/socket.h" HAVE_SYS_SOCKET_H )
+CHECK_INCLUDE_FILES ( "stdarg.h" HAVE_STDARG_H )
+CHECK_INCLUDE_FILES ( "stdint.h" HAVE_STDINT_H )
+CHECK_INCLUDE_FILES ( "stdlib.h" HAVE_STDLIB_H )
+CHECK_INCLUDE_FILES ( "string.h" HAVE_STRING_H )
+CHECK_INCLUDE_FILES ( "strings.h" HAVE_STRINGS_H )
+CHECK_INCLUDE_FILES ( "sys/stat.h" HAVE_SYS_STAT_H )
+CHECK_INCLUDE_FILES ( "sys/types.h" HAVE_SYS_TYPES_H )
+CHECK_INCLUDE_FILES ( "unistd.h" HAVE_UNISTD_H )
+
+CHECK_FUNCTION_EXISTS ( "accept" HAVE_ACCEPT )
+CHECK_FUNCTION_EXISTS ( "memmove" HAVE_MEMMOVE )
+CHECK_FUNCTION_EXISTS ( "memset" HAVE_MEMSET )
+CHECK_FUNCTION_EXISTS ( "strdup" HAVE_STRDUP )
+
+IF(CMAKE_SIZEOF_VOID_P EQUAL "8")
+ SET(ANTLR3_USE_64BIT 1)
+ENDIF(CMAKE_SIZEOF_VOID_P EQUAL "8")
+
+CONFIGURE_FILE(${CMAKE_CURRENT_SOURCE_DIR}/antlr3config.h.in.cmake ${CMAKE_CURRENT_BINARY_DIR}/antlr3config.h)
+
+set(ANTLR3_SRCS
+ src/antlr3baserecognizer.c
+ src/antlr3basetreeadaptor.c
+ src/antlr3basetree.c
+ src/antlr3bitset.c
+ src/antlr3collections.c
+ src/antlr3commontoken.c
+ src/antlr3commontreeadaptor.c
+ src/antlr3commontree.c
+ src/antlr3commontreenodestream.c
+ src/antlr3convertutf.c
+ src/antlr3cyclicdfa.c
+ src/antlr3debughandlers.c
+ src/antlr3encodings.c
+ src/antlr3exception.c
+ src/antlr3filestream.c
+ src/antlr3inputstream.c
+ src/antlr3intstream.c
+ src/antlr3lexer.c
+ src/antlr3parser.c
+ src/antlr3rewritestreams.c
+ src/antlr3string.c
+ src/antlr3tokenstream.c
+ src/antlr3treeparser.c
+)
+
+set(ANTLR3_HDRS
+ include/antlr3baserecognizer.h
+ include/antlr3basetreeadaptor.h
+ include/antlr3basetree.h
+ include/antlr3bitset.h
+ include/antlr3collections.h
+ include/antlr3commontoken.h
+ include/antlr3commontreeadaptor.h
+ include/antlr3commontree.h
+ include/antlr3commontreenodestream.h
+ include/antlr3convertutf.h
+ include/antlr3cyclicdfa.h
+ include/antlr3debugeventlistener.h
+ include/antlr3defs.h
+ include/antlr3encodings.h
+ include/antlr3errors.h
+ include/antlr3exception.h
+ include/antlr3filestream.h
+ include/antlr3.h
+ include/antlr3input.h
+ include/antlr3interfaces.h
+ include/antlr3intstream.h
+ include/antlr3lexer.h
+ include/antlr3memory.h
+ include/antlr3parser.h
+ include/antlr3parsetree.h
+ include/antlr3recognizersharedstate.h
+ include/antlr3rewritestreams.h
+ include/antlr3string.h
+ include/antlr3tokenstream.h
+ include/antlr3treeparser.h
+
+ antlr3config.h
+ )
+
+include_directories(include ${CMAKE_CURRENT_BINARY_DIR})
+
+add_library(libantlr3c STATIC ${ANTLR3_SRCS})

Added: branches/tora-trotl/src/libantlr3c-3.3/COPYING
===================================================================
--- branches/tora-trotl/src/libantlr3c-3.3/COPYING (rev 0)
+++ branches/tora-trotl/src/libantlr3c-3.3/COPYING 2011-05-14 11:12:04 UTC (rev 3962)
@@ -0,0 +1,29 @@
+// [The "BSD licence"]
+// Copyright (c) 2005-2009 Jim Idle, Temporal Wave LLC
+// http://www.temporal-wave.com
+// http://www.linkedin.com/in/jimidle
+//
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions
+// are met:
+// 1. Redistributions of source code must retain the above copyright
+// notice, this list of conditions and the following disclaimer.
+// 2. Redistributions in binary form must reproduce the above copyright
+// notice, this list of conditions and the following disclaimer in the
+// documentation and/or other materials provided with the distribution.
+// 3. The name of the author may not be used to endorse or promote products
+// derived from this software without specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+// IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+// OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+// IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+// INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+// NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+// THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+

Added: branches/tora-trotl/src/libantlr3c-3.3/ChangeLog
===================================================================
--- branches/tora-trotl/src/libantlr3c-3.3/ChangeLog (rev 0)
+++ branches/tora-trotl/src/libantlr3c-3.3/ChangeLog 2011-05-14 11:12:04 UTC (rev 3962)
@@ -0,0 +1,550 @@
+The following changes (change numbers refer to Perforce) were
+made from version 3.1.1 to 3.1.2
+
+Runtime
+-------
+
+Change 5641 on 2009/02/20 by ji...@ji...tlr3
+
+ Release version 3.1.2 of the ANTLR C runtime.
+
+ Updated documents and release notes will have to follow later.
+
+Change 5639 on 2009/02/20 by ji...@ji...tlr3
+
+ Fixed: ANTLR-356
+
+ Ensure that code generation for C++ does not require casts
+
+Change 5577 on 2009/02/12 by ji...@ji...tlr3
+
+ C Runtime - Bug fixes.
+
+ o Having moved to use an extract directly from a vector for returning
+ tokens, it exposed a
+ bug whereby the EOF boundary calculation in tokLT was incorrectly
+ checking > rather than >=.
+ o Changing to API initialization of tokens rather than memcmp()
+ incorrectly forgot to set the input stream pointer for the
+ manufactured tokens in the token factory;
+ o Rewrite streams for rewriting tree parsers did not check whether the
+ rewrite stream was ever assigned before trying to free it; it is now
+ in line with the ordinary parser code.
+
+Change 5576 on 2009/02/11 by ji...@ji...tlr3
+
+ C Runtime: Ensure that when we manufacture a new token for a missing
+ token, the user-supplied custom information (if any) is copied
+ from the current token.
+
+Change 5575 on 2009/02/08 by ji...@ji...tlr3
+
+ C Runtime - Vastly improve the reuse of allocated memory for nodes in
+ tree rewriting.
+
+ A problem for all targets at the moment is that the rewrite logic
+ generated by ANTLR makes no attempt
+ to reuse any resources; it merely guarantees that the tree shape at the
+ end is correct. To some extent this is mitigated by the garbage
+ collection systems of Java and .Net, even though it is still an overhead to
+ keep creating so many nodes.
+
+ This change implements the first of two C runtime changes that make
+ best efforts to track when a node has become orphaned and will never
+ be reused, based on inherent knowledge of the rewrite logic (which in
+ the long term is not a great solution).
+
+ Much of the rewrite logic consists of creating a nilNode into which
+ child nodes are appended. At rulePost processing time, when a rewrite
+ stream is closed, and when becomeRoot is called, there are many situations
+ where the root of the tree that will be manipulated, or is finished with
+ (in the case of rewrite streams), is a nilNode that was just a temporary
+ creation for the sake of the rewrite itself.
+
+ In these cases we can see that the nilNode would just be left to rot in
+ the node factory that tracks all the tree nodes.
+ Rather than leave these in the factory to rot, we now keep a reuse
+ stack and always reuse any node on this
+ stack before claiming a new node from the factory pool.
+
+ This single change alone reduces memory usage in the test case (20,604
+ line C program and a GNU C parser)
+ from nearly a GB to 276MB. This is still way more memory than we
+ should need to do this operation, even on such a large input file,
+ but the reduction results in a huge performance increase and greatly
+ reduced system time spent on allocations.
+
+ After this optimization, comparison with gcc yields:
+
+ time gcc -S a.c
+ a.c:1026: warning: conflicting types for built-in function ‘vsprintf’
+ a.c:1030: warning: conflicting types for built-in function ‘vsnprintf’
+ a.c:1041: warning: conflicting types for built-in function ‘vsscanf’
+ 0.21user 0.01system 0:00.22elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
+ 0inputs+240outputs (0major+8345minor)pagefaults 0swaps
+
+ and
+
+ time ./jimi
+ Reading a.c
+ 0.28user 0.11system 0:00.39elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
+ 0inputs+0outputs (0major+66609minor)pagefaults 0swaps
+
+ And we can now infer that the only major difference is
+ now the huge disparity in memory allocations. A
+ future optimization of vector pooling, to separate node reuse from vector
+ reuse, currently looks promising for further reuse of memory.
+
+ Finally, a static analysis of the rewrite code, plus a realtime analysis
+ of the heap at runtime, may well give us a reasonable memory usage
+ pattern. In reality though, it is the generated rewrite logic
+ that must become optimal at not continuously rewriting things that it
+ need not, as it ascends the rule chain.
+
+Change 5563 on 2009/01/28 by ji...@ji...tlr3
+
+ Allow rewrite streams to use the base adaptor's vector factory and not
+ try to malloc new vectors themselves.
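Editorial sketch, not part of the committed ChangeLog: the reuse stack described under Change 5575 can be pictured roughly as follows. The names `Node`, `NodeFactory`, `factory_new_node`, `factory_release_node` and `factory_destroy` are hypothetical and are not the ANTLR3 C runtime's actual types or functions; the sketch only assumes that orphaned nodes are pushed on a stack and handed out again before anything new is allocated.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical tree node; NOT the ANTLR3 C runtime's real node type. */
typedef struct Node {
    int          type;
    struct Node *next_free;   /* link used only while on the reuse stack */
} Node;

/* Hypothetical factory that remembers orphaned nodes. */
typedef struct NodeFactory {
    Node  *reuse;       /* stack of orphaned nodes ready for reuse   */
    size_t allocated;   /* number of nodes ever obtained from malloc */
} NodeFactory;

/* Hand out a node: pop the reuse stack first, allocate only if it is empty. */
static Node *factory_new_node(NodeFactory *f, int type)
{
    Node *n = f->reuse;
    if (n != NULL) {
        f->reuse = n->next_free;
    } else {
        n = malloc(sizeof *n);
        if (n == NULL) {
            return NULL;
        }
        f->allocated++;
    }
    n->type = type;
    n->next_free = NULL;
    return n;
}

/* Called when a temporary node (for example a nil root) is known to be orphaned. */
static void factory_release_node(NodeFactory *f, Node *n)
{
    n->next_free = f->reuse;
    f->reuse = n;
}

/* Free every node currently parked on the reuse stack. */
static void factory_destroy(NodeFactory *f)
{
    while (f->reuse != NULL) {
        Node *n = f->reuse;
        f->reuse = n->next_free;
        free(n);
    }
}
```

In this sketch a release followed by a new request returns the same node, so the allocation count stays flat no matter how many temporary nodes a rewrite creates one after another.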
+
+Change 5562 on 2009/01/28 by ji...@ji...tlr3
+
+ Don't use CALLOC to allocate tree pools, use malloc as there is no need
+ for calloc.
+
+Change 5561 on 2009/01/28 by ji...@ji...tlr3
+
+ Prevent warnings about retval.stop not being initialized when a rule
+ returns early because it is in backtracking mode
+
+Change 5558 on 2009/01/28 by ji...@ji...tlr3
+
+ Lots of optimizations (though the next one to be checked in is the huge
+ win) for AST building and vector factories.
+
+ A large part of tree rewriting was the creation of vectors to hold AST
+ nodes. Although I had created a vector factory, for some reason I never got
+ around to creating a proper one that pre-allocated the vectors in chunks and
+ so on. I guess I just forgot to. Hence a big win here is prevention of calling
+ malloc lots and lots of times to create vectors.
+
+ A second improvement was to change the vector definition such that it
+ holds a certain number of elements within the vector structure itself, rather
+ than mallocing and freeing these. Currently this is set to 8, but may increase.
+ For AST construction, this is generally a big win because AST nodes don't often
+ have many individual children unless there has not been any shaping going on in
+ the parser. But if you are not shaping, then you don't really need a tree.
+
+ Other performance improvements here include not calling functions
+ indirectly within token stream and common token stream. Hence tokens are
+ claimed directly from the vectors. Users can override these functions of course
+ and all this means is that if you override tokenstreams then you pretty much
+ have to provide all the methods, but then I think you would have to anyway (and
+ I don't know of anyone that has wanted to do this as you can carry your own
+ structure around with the tokens anyway and that is much easier).
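Editorial sketch, not part of the committed ChangeLog: the "elements held within the vector structure itself" idea from Change 5558 is commonly implemented as a small inline buffer with a heap fallback. The `Vec` type and its functions below are hypothetical illustrations, not the runtime's own vector API; only the figure of 8 inline slots is taken from the entry above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define VEC_INLINE 8   /* inline slot count, mirroring the "currently 8" above */

/* Hypothetical vector; NOT the ANTLR3 C runtime's vector type.
 * Note: a Vec must not be copied by value while it still uses inline_buf,
 * because items would then point into the old copy. */
typedef struct Vec {
    void  **items;                   /* points at inline_buf or a heap block */
    size_t  count;
    size_t  capacity;
    void   *inline_buf[VEC_INLINE];  /* storage that needs no malloc */
} Vec;

static void vec_init(Vec *v)
{
    v->items    = v->inline_buf;
    v->count    = 0;
    v->capacity = VEC_INLINE;
}

/* Append an element; returns 0 on success, -1 if memory could not be obtained. */
static int vec_push(Vec *v, void *item)
{
    if (v->count == v->capacity) {
        size_t new_cap = v->capacity * 2;
        void **heap;
        if (v->items == v->inline_buf) {
            /* First spill: move the inline elements to a heap block. */
            heap = malloc(new_cap * sizeof *heap);
            if (heap == NULL) {
                return -1;
            }
            memcpy(heap, v->inline_buf, v->count * sizeof *heap);
        } else {
            heap = realloc(v->items, new_cap * sizeof *heap);
            if (heap == NULL) {
                return -1;
            }
        }
        v->items    = heap;
        v->capacity = new_cap;
    }
    v->items[v->count++] = item;
    return 0;
}

/* Release heap storage, if any, and return to the empty inline state. */
static void vec_free(Vec *v)
{
    if (v->items != v->inline_buf) {
        free(v->items);
    }
    vec_init(v);
}
```

With this layout a node with eight or fewer children never touches the allocator, which is the case the entry argues is typical for shaped ASTs.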
+
+Change 5555 on 2009/01/26 by ji...@ji...tlr3
+
+ Fixed: ANTLR-288
+ Correct the interpretation of the skip token such that channel, start
+ index, char pos in line, start line and text are correctly reset to the start of
+ the new token when the one that we just traversed was marked as being skipped.
+
+ This correctly excludes the text that was matched as part of the
+ SKIP()ed token from the next token in the token stream and so has the side
+ effect that asking for $text of a rule no longer includes the text that should
+ be skipped, but DOES include the text of tokens that were merely placed off the
+ default channel.
+
+Change 5551 on 2009/01/25 by ji...@ji...tlr3
+
+ Fixed: ANTLR-287
+ Most of the source files did not include the BSD license. This might
+ not be that big a deal given that I don't care what people do with it
+ other than take my name off it, but having the license reproduced
+ everywhere
+ at least makes things perfectly clear. Hence this mass change of
+ sources and templates
+ to include the license.
+
+Change 5550 on 2009/01/25 by ji...@ji...tlr3
+
+ Fixed: ANTLR-365
+ Ensure that as soon as we know about an input stream on the lexer,
+ we borrow its string factory and use it in our EOF token in case
+ anyone tries to make it a string, such as in error messages for
+ instance.
+
+Change 5548 on 2009/01/25 by ji...@ji...tlr3
+
+ Fixed: ANTLR-363
+ At some point the Java runtime default changed from discarding off-channel
+ tokens to preserving them. The fix is to make the C runtime also
+ default to preserving off-channel tokens.
+
+Change 5544 on 2009/01/24 by ji...@ji...tlr3
+
+ Fixed: ANTLR-360
+ Ensure that the fillBuffer function does not call any methods
+ that require the cached buffer size to be recorded before we
+ have actually recorded it.
+
+Change 5543 on 2009/01/24 by ji...@ji...tlr3
+
+ Fixed: ANTLR-362
+ Some users have started using string factories themselves and
+ exposed a flaw in the destroy method, which is intended to remove
+ a string that was created by the factory and is no longer needed.
+ The string was correctly removed from the vector that tracks them
+ but after the first one, all the remaining strings were then numbered
+ incorrectly. Hence the destroy method has been recoded to reindex
+ the strings in the factory after one is removed and everything is once
+ more hunky dory.
+ User suggested fix rejected.
+
+Change 5542 on 2009/01/24 by ji...@ji...tlr3
+
+ Fixed ANTLR-366
+ The recognizer state now ensures that all fields are set to NULL upon
+ creation
+ and the reset does not overwrite the tokenname array
+
+Change 5527 on 2009/01/15 by ji...@ji...tlr3
+
+ Add the C runtime for 3.1.2 beta2 to perforce
+
+Change 5526 on 2009/01/15 by ji...@ji...tlr3
+
+ Correctly define the MEMMOVE macro which was inadvertently left to be
+ memcpy.
+
+Change 5503 on 2008/12/12 by ji...@ji...tlr3
+
+ Change C runtime release number to 3.1.2 beta
+
+Change 5473 on 2008/12/01 by ji...@ji...tlr3
+
+ Fixed: ANTLR-350 - C runtime use of memcpy
+ Prior change to use memcpy instead of memmove in all cases missed the
+ fact that the string factory can be in a situation where overlaps occur. We now
+ have ANTLR3_MEMCPY and ANTLR3_MEMMOVE and use the two appropriately.
+
+Change 5471 on 2008/12/01 by ji...@ji...tlr3
+
+ Fixed ANTLR-361
+ - Ensure that ANTLR3_BOOLEAN is typedef'ed correctly when building for
+ MinGW
+
+Templates
+---------
+
+Change 5637 on 2009/02/20 by ji...@ji...tlr3
+
+ C runtime - make sure that ADAPTOR results are cast to the tree type on
+ a rewrite
+
+Change 5620 on 2009/02/18 by ji...@ji...tlr3
+
+ Rename/Move:
+ From: //depot/code/antlr/main/src/org/antlr/codegen/templates/...
+ To: //depot/code/antlr/main/src/main/resources/org/antlr/codegen/templates/...
+
+ Relocate the code generating templates to exist in the directory set
+ that maven expects.
+
+ When checking in your templates, you may find it easiest to make a copy
+ of what you have, revert the change in perforce, then just check out the
+ template in the new location, and copy the changes back over. Nobody has more
+ than two files open at the moment.
+
+Change 5578 on 2009/02/12 by ji...@ji...tlr3
+
+ Correct the string template escape sequences for generating scope
+ code in the C templates.
+
+Change 5577 on 2009/02/12 by ji...@ji...tlr3
+
+ C Runtime - Bug fixes.
+
+ o Having moved to use an extract directly from a vector for returning
+ tokens, it exposed a
+ bug whereby the EOF boundary calculation in tokLT was incorrectly
+ checking > rather than
+ >=.
+ o Changing to API initialization of tokens rather than memcmp()
+ incorrectly forgot to
+ set the input stream pointer for the manufactured tokens in the
+ token factory;
+ o Rewrite streams for rewriting tree parsers did not check whether the
+ rewrite stream
+ was ever assigned before trying to free it; it is now in line with
+ the ordinary parser code.
+
+Change 5567 on 2009/01/29 by ji...@ji...tlr3
+
+ C Runtime - Further Optimizations
+
+ Within grammars that used scopes and were intended to parse large
+ inputs with many rule nests,
+ the creation and deletion of the scopes themselves became significant.
+ Careful analysis shows that
+ for most grammars, while a parse could create and delete 20,000 scopes,
+ the maximum depth of
+ any scope was only 8.
+
+ This change therefore changes the scope implementation so that it does
+ not free scope memory when
+ it is popped but just tracks it in a C runtime stack, eventually
+ freeing it when the stack is freed. This change
+ caused the allocation of only 12 scope structures instead of 20,000 for
+ the extreme example case.
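Editorial sketch, not part of the committed ChangeLog: the scope handling described under Change 5567 (pop does not free; memory is kept on a stack and released only when the stack itself is destroyed) might look like the following. `Scope`, `ScopeStack` and the three functions are hypothetical names chosen for illustration, not the code that the C templates generate.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical scope payload; a generated parser would have its own fields. */
typedef struct Scope {
    int   counter;
    void *user_data;
} Scope;

/* Hypothetical stack that keeps popped scopes for later reuse. */
typedef struct ScopeStack {
    Scope **slots;      /* every scope ever allocated, kept until destroy */
    size_t  top;        /* number of scopes currently pushed              */
    size_t  allocated;  /* number of slots that hold a malloc'ed scope    */
    size_t  capacity;   /* size of the slots array                        */
} ScopeStack;

/* Push: reuse the slot at this depth if it exists, otherwise allocate one.
 * The returned memory is NOT zeroed; the caller must initialize every field. */
static Scope *scope_push(ScopeStack *s)
{
    if (s->top == s->allocated) {
        if (s->allocated == s->capacity) {
            size_t  cap   = s->capacity ? s->capacity * 2 : 8;
            Scope **grown = realloc(s->slots, cap * sizeof *grown);
            if (grown == NULL) {
                return NULL;
            }
            s->slots    = grown;
            s->capacity = cap;
        }
        Scope *fresh = malloc(sizeof *fresh);
        if (fresh == NULL) {
            return NULL;
        }
        s->slots[s->allocated++] = fresh;
    }
    return s->slots[s->top++];
}

/* Pop: only the depth counter moves; the memory stays parked in its slot. */
static void scope_pop(ScopeStack *s)
{
    if (s->top > 0) {
        s->top--;
    }
}

/* Destroy: the single place where scope memory is actually freed. */
static void scope_stack_destroy(ScopeStack *s)
{
    for (size_t i = 0; i < s->allocated; i++) {
        free(s->slots[i]);
    }
    free(s->slots);
    s->slots = NULL;
    s->top = s->allocated = s->capacity = 0;
}
```

Under this scheme the number of allocations tracks the maximum nesting depth rather than the number of push operations, which is the effect the entry reports.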
+
+ This change means that scope users must be careful (as ever in C) to
+ initialize their scope elements
+ correctly as:
+
+ 1) If not, you may inherit values from a prior use of the scope
+ structure;
+ 2) Scope structures are now allocated with malloc and not calloc;
+
+ Also, when using a custom free function to clean a scope when it is
+ popped, it is probably a good idea
+ to set any free'd pointers to NULL (this is generally good C programming
+ practice in any case)
+
+Change 5566 on 2009/01/29 by ji...@ji...tlr3
+
+ Remove redundant BACKTRACK checking so that MSVC9 does not get confused
+ about possibly uninitialized variables
+
+Change 5565 on 2009/01/28 by ji...@ji...tlr3
+
+ Use malloc rather than calloc to allocate memory for new scopes. Note
+ that this means users will have to be careful to initialize any values in their
+ scopes that they expect to be 0 or NULL and I must document this.
+
+Change 5564 on 2009/01/28 by ji...@ji...tlr3
+
+ Use malloc rather than calloc for copying list label tokens for
+ rewrites.
+
+Change 5561 on 2009/01/28 by ji...@ji...tlr3
+
+ Prevent warnings about retval.stop not being initialized when a rule
+ returns early because it is in backtracking mode
+
+Change 5560 on 2009/01/28 by ji...@ji...tlr3
+
+ Add a NULL check before freeing rewrite streams used in AST rewrites
+ rather than auto-rewrites.
+
+ While the NULL check is redundant as the free cannot be called unless
+ it is assigned, Visual Studio C 2008
+ gets it wrong and thinks that there is a path that can arrive at the
+ free without it being assigned and that is too annoying to ignore.
+
+Change 5559 on 2009/01/28 by ji...@ji...tlr3
+
+ C target Tree rewrite optimization
+
+ There is only one optimization in this change, but it is a huge one.
+
+ The code generation templates were set up so that at the start of a rule,
+ any rewrite streams mentioned in the rule were pre-created. However, this
+ is a massive overhead for rules where only one or two of the streams are
+ actually used, as we create them then free them without ever using them.
+ This was copied from the Java templates basically.
+ This caused literally millions of extra calls and vector allocations
+ in the case of the GNU C parser given to me for testing with a 20,000
+ line program.
+
+ After this change, the following comparison is available against the gcc
+ compiler:
+
+ Before (different machines here so use the relative difference for
+ comparison):
+
+ gcc:
+
+ real 0m0.425s
+ user 0m0.384s
+ sys 0m0.036s
+
+ ANTLR C:
+ real 0m1.958s
+ user 0m1.284s
+ sys 0m0.656s
+
+ After the previous optimizations for vector pooling via a factory,
+ plus this huge win in removing redundant code, we have the following
+ (different machine to the one above):
+
+ gcc:
+ 0.21user 0.01system 0:00.23elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
+ 0inputs+328outputs (0major+9922minor)pagefaults 0swaps
+
+ ANTLR C:
+
+ 0.37user 0.26system 0:00.64elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
+ 0inputs+0outputs (0major+130944minor)pagefaults 0swaps
+
+ The extra system time comes from the fact that although the tree
+ rewriting is now optimal in terms of not allocating things it does
+ not need, there is still a lot more overhead in a parser that is generated
+ for generic use, including much more use of structures for tokens and extra
+ copying and so on. I will
+ continue to work on improving things where I can, but the next big
+ improvement will come from Ter's optimization of the actual code structures we
+ generate including not doing things with rewrite streams that we do not need to
+ do at all.
+
+ The second machine I used is about twice as fast CPU wise as the system
+ that was used originally by the user that asked about this performance.
+
+Change 5558 on 2009/01/28 by ji...@ji...tlr3
+
+ Lots of optimizations (though the next one to be checked in is the huge
+ win) for AST building and vector factories.
+
+ A large part of tree rewriting was the creation of vectors to hold AST
+ nodes. Although I had created a vector factory, for some reason I never got
+ around to creating a proper one that pre-allocated the vectors in chunks and
+ so on. I guess I just forgot to. Hence a big win here is prevention of calling
+ malloc lots and lots of times to create vectors.
+
+ A second improvement was to change the vector definition such that it
+ holds a certain number of elements within the vector structure itself, rather
+ than mallocing and freeing these. Currently this is set to 8, but may increase.
+ For AST construction, this is generally a big win because AST nodes don't often
+ have many individual children unless there has not been any shaping going on in
+ the parser. But if you are not shaping, then you don't really need a tree.
+
+ Other performance improvements here include not calling functions
+ indirectly within token stream and common token stream. Hence tokens are
+ claimed directly from the vectors. Users can override these functions of course
+ and all this means is that if you override tokenstreams then you pretty much
+ have to provide all the methods, but then I think you would have to anyway (and
+ I don't know of anyone that has wanted to do this as you can carry your own
+ structure around with the tokens anyway and that is much easier).
+
+Change 5554 on 2009/01/26 by ji...@ji...tlr3
+
+ Fixed: ANTLR-379
+ For some reason in the past, the ruleMemoization() template had required
+ that the name parameter be set to the rule name. This does not seem to be a
+ requirement any more. The name=xxx override when invoking the template was
+ causing all the scope names derived when cleaning up in memoization to be
+ called after the rule name, which was not correct. However, this only affected
+ the output when in output=AST mode.
+
+ This template invocation is now corrected.
+
+Change 5553 on 2009/01/26 by ji...@ji...tlr3
+
+ Fixed: ANTLR-330
+ Managed to get the one rule that could not see the ASTLabelType to call
+ back in to the super template C.stg and ask it to construct the name. I am not
+ 100% sure that this fixes all cases, but I cannot find any that fail. Please
+ let me know if you find any examples of being unable to default the
+ ASTLabelType option in the C target.
+
+Change 5552 on 2009/01/25 by ji...@ji...tlr3
+
+ Progress: ANTLR-327
+ Fix debug code generation templates when output=AST such that code
+ can at least be generated and I can debug the output code correctly.
+ Note that this checkin does not implement the debugging requirements
+ for tree generating parsers.
+
+Change 5551 on 2009/01/25 by ji...@ji...tlr3
+
+ Fixed: ANTLR-287
+ Most of the source files did not include the BSD license. This might
+ not be that big a deal given that I don't care what people do with it
+ other than take my name off it, but having the license reproduced
+ everywhere at least makes things perfectly clear. Hence this mass change of
+ sources and templates to include the license.
+
+Change 5549 on 2009/01/25 by ji...@ji...tlr3
+
+ Fixed: ANTLR-354
+ Using 0.0D as the default initialize value for a double caused
+ VS 2003 C compiler to bomb out. There seems to be no reason other
+ than force of habit to set this to 0.0D so I have dropped the D so
+ that older compilers do not complain.
+ +Change 5547 on 2009/01/25 by ji...@ji...tlr3 + + Fixed: ANTLR-282 + All references are now unadorned with any type of NULL check for the + following reasons: + + 1) A NULL reference means that there is a problem with the + grammar and we need the program to fail immediately so + that the programmer can work out where the problem occurred; + 2) Most of the time, the only sensible value that can be + returned is NULL or 0 which + obviates the NULL check in the first place; + 3) If we replace a NULL reference with some value such as 0, + then the program may blithely continue but just do something + logically wrong, which will be very difficult for the + grammar programmer to detect and correct. + +Change 5545 on 2009/01/24 by ji...@ji...tlr3 + + Fixed: ANTLR-357 + The bug report was correct in that the types of references to things + like $start were being incorrectly cast as they were not changed from + Java style casts (and the casts are unnecessary). This is now fixed + and references are referencing the correct, uncast, types. + However, the bug report was wrong in that the reference in the book to + $start.pos will only work for Java and really, it is incorrect in the + book because it should not access the .pos member directly but should + be using $start.getCharPositionInLine(). + Because there is no access qualification in C, one could use + $start.charPosition, however + really this should be $start->getCharPositionInLine($start); + +Change 5541 on 2009/01/24 by ji...@ji...tlr3 + + Fixed - ANTLR-367 + The code generation for the free method of a recognizer was not + distinguishing tree parsers from parsers when it came to calling delegate free + functions. + This is now corrected. + +Change 5540 on 2009/01/24 by ji...@ji...tlr3 + + Fixed ANTLR-355 + Ensure that we do not attempt to free any memory that we did not + actually allocate because the parser rule was being executed in + backtracking mode. 
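The ANTLR-357 entry above prefers $start->getCharPositionInLine($start) over reading $start.charPosition directly. The calling convention behind that (a structure carrying a function pointer that is passed the structure itself) might look like the sketch below. The names charPosition and getCharPositionInLine come from the ChangeLog text; the sketch_token type and the two helper functions are hypothetical stand-ins, not the runtime's real token definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token type for illustration; not the runtime's real token. */
typedef struct sketch_token sketch_token;

struct sketch_token {
    /* Data member. C has no access qualifiers, so callers can read it
       directly, which is why the ChangeLog says one "could" do so. */
    int charPosition;

    /* 'Member function': a pointer that receives the object explicitly. */
    int (*getCharPositionInLine)(sketch_token *self);
};

/* Default implementation installed by the initializer below. */
static int sketch_get_char_position(sketch_token *self)
{
    assert(self != NULL);
    return self->charPosition;
}

/* Plays the role of a constructor: sets the data and wires up the 'method'. */
static void sketch_token_init(sketch_token *tok, int position)
{
    assert(tok != NULL);
    tok->charPosition          = position;
    tok->getCharPositionInLine = sketch_get_char_position;
}
```

A caller holding a sketch_token *start would then write start->getCharPositionInLine(start). Going through the pointer keeps working if the function is later replaced with one that computes the position differently, whereas a direct read of charPosition would not.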
+ +Change 5539 on 2009/01/24 by ji...@ji...tlr3 + + Fixed: ANTLR-355 + When a C targeted parser is producing in backtracking mode, then the + creation of new stream rewrite structures should not happen if the rule is + currently backtracking. + +Change 5502 on 2008/12/11 by ji...@ji...tlr3 + + Fixed: ANTLR-349 Ensure that all marker labels in the lexer are 64 bit + compatible + +Change 5473 on 2008/12/01 by ji...@ji...tlr3 + + Fixed: ANTLR-350 - C runtime use of memcpy + Prior change to use memcpy instead of memmove in all cases missed the + fact that the string factory can be in a situation where overlaps occur. We now + have ANTLR3_MEMCPY and ANTLR3_MEMMOVE and use the two appropriately. + +Change 5387 on 2008/11/05 by parrt@parrt.spork + + Fixed x+=. issue with tree grammars; added unit test + +Change 5325 on 2008/10/23 by parrt@parrt.spork + + We were all ref'ing backtracking==0 hardcoded instead of checking the + @synpredgate action. + + Added: branches/tora-trotl/src/libantlr3c-3.3/Makefile.am =================================================================== --- branches/tora-trotl/src/libantlr3c-3.3/Makefile.am (rev 0) +++ branches/tora-trotl/src/libantlr3c-3.3/Makefile.am 2011-05-14 11:12:04 UTC (rev 3962) @@ -0,0 +1,81 @@ +AUTOMAKE_OPTIONS = gnu +AM_LIBTOOLFLAGS = +## --silent +ACLOCAL_AMFLAGS = -I m4 +lib_LTLIBRARIES = libantlr3c.la + +LIBSOURCES = src/antlr3baserecognizer.c \ + src/antlr3basetree.c \ + src/antlr3basetreeadaptor.c \ + src/antlr3bitset.c \ + src/antlr3collections.c \ + src/antlr3commontoken.c \ + src/antlr3commontree.c \ + src/antlr3commontreeadaptor.c \ + src/antlr3commontreenodestream.c \ + src/antlr3convertutf.c \ + src/antlr3cyclicdfa.c \ + src/antlr3debughandlers.c \ + src/antlr3encodings.c \ + src/antlr3exception.c \ + src/antlr3filestream.c \ + src/antlr3inputstream.c \ + src/antlr3intstream.c \ + src/antlr3lexer.c \ + src/antlr3parser.c \ + src/antlr3rewritestreams.c \ + src/antlr3string.c \ + src/antlr3stringstream.c \ + 
src/antlr3tokenstream.c \ + src/antlr3treeparser.c \ + src/antlr3ucs2inputstream.c + +libantlr3c_la_SOURCES = $(LIBSOURCES) + +include_HEADERS = include/antlr3.h \ + include/antlr3baserecognizer.h \ + include/antlr3basetree.h \ + include/antlr3basetreeadaptor.h \ + include/antlr3bitset.h \ + include/antlr3collections.h \ + include/antlr3commontoken.h \ + include/antlr3commontree.h \ + include/antlr3commontreeadaptor.h \ + include/antlr3commontreenodestream.h \ + include/antlr3convertutf.h \ + include/antlr3cyclicdfa.h \ + include/antlr3debugeventlistener.h \ + include/antlr3defs.h \ + include/antlr3encodings.h \ + include/antlr3errors.h \ + include/antlr3exception.h \ + include/antlr3filestream.h \ + include/antlr3input.h \ + include/antlr3interfaces.h \ + include/antlr3intstream.h \ + include/antlr3lexer.h \ + include/antlr3memory.h \ + include/antlr3parser.h \ + include/antlr3parsetree.h \ + include/antlr3recognizersharedstate.h \ + include/antlr3rewritestreams.h \ + include/antlr3string.h \ + include/antlr3stringstream.h \ + include/antlr3tokenstream.h \ + include/antlr3treeparser.h \ + antlr3config.h + +libantlr3c_la_LDFLAGS = -avoid-version + +INCLUDES = -Iinclude + +EXTRA_DIST = \ + vsrulefiles/antlr3lexerandparser.rules \ + vsrulefiles/antlr3lexer.rules \ + vsrulefiles/antlr3parser.rules \ + vsrulefiles/antlr3treeparser.rules \ + C.sln C.vcproj C.vcproj.vspscc \ + C.vssscc doxyfile doxygen + +export OBJECT_MODE + Added: branches/tora-trotl/src/libantlr3c-3.3/NEWS =================================================================== --- branches/tora-trotl/src/libantlr3c-3.3/NEWS (rev 0) +++ branches/tora-trotl/src/libantlr3c-3.3/NEWS 2011-05-14 11:12:04 UTC (rev 3962) @@ -0,0 +1,2 @@ +See www.antlr.org and the associated email forums for release dates and +other announcements. 
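The ANTLR-350 entry in the ChangeLog above separates copies that may overlap (memmove) from copies that cannot (memcpy). A minimal sketch of why an in-place string operation needs both is given here. The function sketch_insert and its parameters are hypothetical and only illustrate the distinction; they are not the string factory's code, and the sketch assumes that text does not point into buf.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: inserts `text` into the NUL-terminated string held in
   `buf` (which has room for `capacity` bytes) at offset `pos`.
   Assumes `text` does not point into `buf`.
   Returns 1 on success, 0 if `pos` is out of range or the result won't fit. */
static int sketch_insert(char *buf, size_t capacity, size_t pos, const char *text)
{
    size_t len, add;

    assert(buf != NULL && text != NULL);
    len = strlen(buf);
    add = strlen(text);
    if (pos > len || len + add + 1 > capacity) return 0;

    /* The tail is shifted right within the same buffer, so source and
       destination overlap: memcpy would be undefined here, memmove is not.
       The +1 carries the terminating NUL along. */
    memmove(buf + pos + add, buf + pos, len - pos + 1);

    /* `text` is a separate object, so a plain memcpy is sufficient. */
    memcpy(buf + pos, text, add);
    return 1;
}
```

For example, inserting "cd" at offset 2 of a 16-byte buffer holding "abef" leaves "abcdef" in the buffer.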
Added: branches/tora-trotl/src/libantlr3c-3.3/README =================================================================== --- branches/tora-trotl/src/libantlr3c-3.3/README (rev 0) +++ branches/tora-trotl/src/libantlr3c-3.3/README 2011-05-14 11:12:04 UTC (rev 3962) @@ -0,0 +1,1924 @@ +ANTLR v3.0.1 C Runtime +ANTLR 3.0.1 +January 1, 2008 + +At the moment, the use of the C runtime engine for the parser is not generally +for the inexperienced C programmer. However this is mainly because of the lack +of documentation on use, which will be corrected shortly. The C runtime +code itself is however well documented with doxygen style comments and a +reasonably experienced C programmer should be able to piece it together. You +can visit the documentation at: http://www.antlr.org/api/C/index.html + +The general make up is that everything is implemented as a pseudo class/object +initialized with pointers to its 'member' functions and data. All objects are +(usually) created by factories, which auto manage the memory allocation and +release and generally make life easier. If you remember this rule, everything +should fall in to place. + +Jim Idle - Portland Oregon, Jan 2008 +jimi idle ws + +=============================================================================== + +Terence Parr, parrt at cs usfca edu +ANTLR project lead and supreme dictator for life +University of San Francisco + +INTRODUCTION + +Welcome to ANTLR v3! I've been working on this for nearly 4 years and it's +almost ready! I plan no feature additions between this beta and first +3.0 release. I have lots of features to add later, but this will be +the first set. Ultimately, I need to rewrite ANTLR v3 in itself (it's +written in 2.7.7 at the moment and also needs StringTemplate 3.0 or +later). + +You should use v3 in conjunction with ANTLRWorks: + + http://www.antlr.org/works/index.html + +WARNING: We have bits of documentation started, but nothing super-complete +yet. 
The book will be printed May 2007: + +http://www.pragmaticprogrammer.com/titles/tpantlr/index.html + +but we should have a beta PDF available on that page in Feb 2007. + +You also have the examples plus the source to guide you. + +See the new wiki FAQ: + + http://www.antlr.org/wiki/display/ANTLR3/ANTLR+v3+FAQ + +and general doc root: + + http://www.antlr.org/wiki/display/ANTLR3/ANTLR+3+Wiki+Home + +Please help add/update FAQ entries. + +I have made very little effort at this point to deal well with +erroneous input (e.g., bad syntax might make ANTLR crash). I will clean +this up after I've rewritten v3 in v3. + +Per the license in LICENSE.txt, this software is not guaranteed to +work and might even destroy all life on this planet: + +THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR +IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, +INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, +STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING +IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + +EXAMPLES + +ANTLR v3 sample grammars: + + http://www.antlr.org/download/examples-v3.tar.gz + +contains the following examples: LL-star, cminus, dynamic-scope, +fuzzy, hoistedPredicates, island-grammar, java, python, scopes, +simplecTreeParser, treeparser, tweak, xmlLexer. + +Also check out Mantra Programming Language for a prototype (work in +progress) using v3: + + http://www.linguamantra.org/ + +---------------------------------------------------------------------- + +What is ANTLR? 
+ +ANTLR stands for (AN)other (T)ool for (L)anguage (R)ecognition and was +originally known as PCCTS. ANTLR is a language tool that provides a +framework for constructing recognizers, compilers, and translators +from grammatical descriptions containing actions. Target language list: + +http://www.antlr.org/wiki/display/ANTLR3/Code+Generation+Targets + +---------------------------------------------------------------------- + +How is ANTLR v3 different than ANTLR v2? + +See migration guide: + http://www.antlr.org/wiki/display/ANTLR3/Migrating+from+ANTLR+2+to+ANTLR+3 + +ANTLR v3 has a far superior parsing algorithm called LL(*) that +handles many more grammars than v2 does. In practice, it means you +can throw almost any grammar at ANTLR that is non-left-recursive and +unambiguous (same input can be matched by multiple rules); the cost is +perhaps a tiny bit of backtracking, but with a DFA not a full parser. +You can manually set the max lookahead k as an option for any decision +though. The LL(*) algorithm ramps up to use more lookahead when it +needs to and is much more efficient than normal LL backtracking. There +is support for syntactic predicate (full LL backtracking) when LL(*) +fails. + +Lexers are much easier due to the LL(*) algorithm as well. Previously +these two lexer rules would cause trouble because ANTLR couldn't +distinguish between them with finite lookahead to see the decimal +point: + +INT : ('0'..'9')+ ; +FLOAT : INT '.' INT ; + +The syntax is almost identical for features in common, but you should +note that labels are always '=' not ':'. So do id=ID not id:ID. + +You can do combined lexer/parser grammars again (ala PCCTS) both lexer +and parser rules are defined in the same file. See the examples. +Really nice. You can reference strings and characters in the grammar +and ANTLR will generate the lexer for you. + +The attribute structure has been enhanced. Rules may have multiple +return values, for example. 
Further, there are dynamically scoped +attributes whereby a rule may define a value usable by any rule it +invokes directly or indirectly w/o having to pass a parameter all the +way down. + +ANTLR v3 tree construction is far superior--it provides tree rewrite +rules where the right hand side is simply the tree grammar fragment +describing the tree you want to build: + +formalArgs + : typename declarator (',' typename declarator )* + -> ^(ARG typename declarator)+ + ; + +That builds tree sequences like: + +^(ARG int v1) ^(ARG int v2) + +ANTLR v3 also incorporates StringTemplate: + + http://www.stringtemplate.org + +just like AST support. It is useful for generating output. For +example this rule creates a template called 'import' for each import +definition found in the input stream: + +grammar Java; +options { + output=template; +} +... +importDefinition + : 'import' identifierStar SEMI + -> import(name={$identifierStar.st}, + begin={$identifierStar.start}, + end={$identifierStar.stop}) + ; + +The attributes are set via assignments in the argument list. The +arguments are actions with arbitrary expressions in the target +language. The .st label property is the result template from a rule +reference. There is a nice shorthand in actions too: + + %foo(a={},b={},...) ctor + %({name-expr})(a={},...) indirect template ctor reference + %{string-expr} anonymous template from string expr + %{expr}.y = z; template attribute y of StringTemplate-typed expr to z + %x.y = z; set template attribute y of x (always set never get attr) + to z [languages like python without ';' must still use the + ';' which the code generator is free to remove during code gen] + Same as '(x).setAttribute("y", z);' + +For ANTLR v3 I decided to make the most common tasks easy by default +rather. This means that some of the basic objects are heavier weight +than some speed demons would like, but they are free to pare it down +leaving most programmers the luxury of having it "just work." 
For +example, to read in some input, tweak it, and write it back out +preserving whitespace, is easy in v3. + +The ANTLR source code is much prettier. You'll also note that the +run-time classes are conveniently encapsulated in the +org.antlr.runtime package. + +---------------------------------------------------------------------- + +How do I install this damn thing? + +Just untar and you'll get: + +antlr-3.0b6/README.txt (this file) +antlr-3.0b6/LICENSE.txt +antlr-3.0b6/src/org/antlr/... +antlr-3.0b6/lib/stringtemplate-3.0.jar (3.0b6 needs 3.0) +antlr-3.0b6/lib/antlr-2.7.7.jar +antlr-3.0b6/lib/antlr-3.0b6.jar + +Then you need to add all the jars in lib to your CLASSPATH. + +---------------------------------------------------------------------- + +How do I use ANTLR v3? + +[I am assuming you are only using the command-line (and not the +ANTLRWorks GUI)]. + +Running ANTLR with no parameters shows you: + +ANTLR Parser Generator Early Access Version 3.0b6 (Jan 31, 2007) 1989-2007 +usage: java org.antlr.Tool [args] file.g [file2.g file3.g ...] 
+ -o outputDir specify output directory where all output is generated + -lib dir specify location of token files + -report print out a report about the grammar(s) processed + -print print out the grammar without actions + -debug generate a parser that emits debugging events + -profile generate a parser that computes profiling information + -nfa generate an NFA for each rule + -dfa generate a DFA for each decision point + -message-format name specify output style for messages + -X display extended argument list + +For example, consider how to make the LL-star example from the examples +tarball you can get at http://www.antlr.org/download/examples-v3.tar.gz + +$ cd examples/java/LL-star +$ java org.antlr.Tool simplec.g +$ jikes *.java + +For input: + +char c; +int x; +void bar(int x); +int foo(int y, char d) { + int i; + for (i=0; i<3; i=i+1) { + x=3; + y=5; + } +} + +you will see output as follows: + +$ java Main input +bar is a declaration +foo is a definition + +What if I want to test my parser without generating code? Easy. Just +run ANTLR in interpreter mode. It can't execute your actions, but it +can create a parse tree from your input to show you how it would be +matched. Use the org.antlr.tool.Interp main class. In the following, +I interpret simplec.g on t.c, which contains "int x;" + +$ java org.antlr.tool.Interp simplec.g WS program t.c +( <grammar SimpleC> + ( program + ( declaration + ( variable + ( type [@0,0:2='int',<14>,1:0] ) + ( declarator [@2,4:4='x',<2>,1:4] ) + [@3,5:5=';',<5>,1:5] + ) + ) + ) +) + +where I have formatted the output to make it more readable. I have +told it to ignore all WS tokens. + +---------------------------------------------------------------------- + +How do I rebuild ANTLR v3? 
+ +Make sure the following two jars are in your CLASSPATH + +antlr-3.0b6/lib/stringtemplate-3.0.jar +antlr-3.0b6/lib/antlr-2.7.7.jar +junit.jar [if you want to build the test directories] + +then jump into antlr-3.0b6/src directory and then type: + +$ javac -d . org/antlr/Tool.java org/antlr/*/*.java org/antlr/*/*/*.java + +Takes 9 seconds on my 1Ghz laptop or 4 seconds with jikes. Later I'll +have a real build mechanism, though I must admit the one-liner appeals +to me. I use Intellij so I never type anything actually to build. + +There is also an ANT build.xml file, but I know nothing of ANT; contributed +by others (I'm opposed to any tool with an XML interface for Humans). + +----------------------------------------------------------------------- +C# Target Notes + +1. Auto-generated lexers do not inherit parent parser's @namespace + {...} value. Use @lexer::namespace{...}. + +----------------------------------------------------------------------- + +CHANGES + +March 17, 2007 + +* Jonathan DeKlotz updated C# templates to be 3.0b6 current + +March 14, 2007 + +* Manually-specified (...)=> force backtracking eval of that predicate. + backtracking=true mode does not however. Added unit test. + +March 14, 2007 + +* Fixed bug in lexer where ~T didn't compute the set from rule T. + +* Added -Xnoinlinedfa make all DFA with tables; no inline prediction with IFs + +* Fixed http://www.antlr.org:8888/browse/ANTLR-80. + Sem pred states didn't define lookahead vars. + +* Fixed http://www.antlr.org:8888/browse/ANTLR-91. + When forcing some acyclic DFA to be state tables, they broke. + Forcing all DFA to be state tables should give same results. + +March 12, 2007 + +* setTokenSource in CommonTokenStream didn't clear tokens list. + setCharStream calls reset in Lexer. + +* Altered -depend. No longer printing grammar files for multiple input + files with -depend. Doesn't show T__.g temp file anymore. Added + TLexer.tokens. Added .h files if defined. 
+ +February 11, 2007 + +* Added -depend command-line option that, instead of processing files, + it shows you what files the input grammar(s) depend on and what files + they generate. For combined grammar T.g: + + $ java org.antlr.Tool -depend T.g + + You get: + + TParser.java : T.g + T.tokens : T.g + T__.g : T.g + + Now, assuming U.g is a tree grammar ref'd T's tokens: + + $ java org.antlr.Tool -depend T.g U.g + + TParser.java : T.g + T.tokens : T.g + T__.g : T.g + U.g: T.tokens + U.java : U.g + U.tokens : U.g + + Handles spaces by escaping them. Pays attention to -o, -fo and -lib. + Dir 'x y' is a valid dir in current dir. + + $ java org.antlr.Tool -depend -lib /usr/local/lib -o 'x y' T.g U.g + x\ y/TParser.java : T.g + x\ y/T.tokens : T.g + x\ y/T__.g : T.g + U.g: /usr/local/lib/T.tokens + x\ y/U.java : U.g + x\ y/U.tokens : U.g + + You have API access via org.antlr.tool.BuildDependencyGenerator class: + getGeneratedFileList(), getDependenciesFileList(). You can also access + the output template: getDependencies(). The file + org/antlr/tool/templates/depend.stg contains the template. You can + modify as you want. File objects go in so you can play with path etc... + +February 10, 2007 + +* no more .gl files generated. All .g all the time. + +* changed @finally to be @after and added a finally clause to the + exception stuff. I also removed the superfluous "exception" + keyword. Here's what the new syntax looks like: + + a + @after { System.out.println("ick"); } + : 'a' + ; + catch[RecognitionException e] { System.out.println("foo"); } + catch[IOException e] { System.out.println("io"); } + finally { System.out.println("foobar"); } + + @after executes after bookkeeping to set $rule.stop, $rule.tree but + before scopes pop and any memoization happens. Dynamic scopes and + memoization are still in generated finally block because they must + exec even if error in rule. The @after action and tree setting + stuff can technically be skipped upon syntax error in rule. 
[Later + we might add something to finally to stick an ERROR token in the + tree and set the return value.] Sequence goes: set $stop, $tree (if + any), @after (if any), pop scopes (if any), memoize (if needed), + grammar finally clause. Last 3 are in generated code's finally + clause. + +3.0b6 - January 31, 2007 + +January 30, 2007 + +* Fixed bug in IntervalSet.and: it returned the same empty set all the time + rather than new empty set. Code altered the same empty set. + +* Made analysis terminate faster upon a decision that takes too long; + it seemed to keep doing work for a while. Refactored some names + and updated comments. Also made it terminate when it realizes it's + non-LL(*) due to recursion. just added terminate conditions to loop + in convert(). + +* Sometimes fatal non-LL(*) messages didn't appear; instead you got + "antlr couldn't analyze", which is actually untrue. I had the + order of some prints wrong in the DecisionProbe. + +* The code generator incorrectly detected when it could use a fixed, + acyclic inline DFA (i.e., using an IF). Upon non-LL(*) decisions + with predicates, analysis made cyclic DFA. But this stops + the computation detecting whether they are cyclic. I just added + a protection in front of the acyclic DFA generator to avoid if + non-LL(*). Updated comments. + +January 23, 2007 + +* Made tree node streams use adaptor to create navigation nodes. + Thanks to Emond Papegaaij. + +January 22, 2007 + +* Added lexer rule properties: start, stop + +January 1, 2007 + +* analysis failsafe is back on; if a decision takes too long, it bails out + and uses k=1 + +January 1, 2007 + +* += labels for rules only work for output option; previously elements + of list were the return value structs, but are now either the tree or + StringTemplate return value. You can label different rules now + x+=a x+=b. + +December 30, 2006 + +* Allow \" to work correctly in "..." template. 
+ +December 28, 2006 + +* errors that are now warnings: missing AST label type in trees. + Also "no start rule detected" is warning. + +* tree grammars also can do rewrite=true for output=template. + Only works for alts with single node or tree as alt elements. + If you are going to use $text in a tree grammar or do rewrite=true + for templates, you must use in your main: + + nodes.setTokenStream(tokens); + +* You get a warning for tree grammars that do rewrite=true and + output=template and have -> for alts that are not simple nodes + or simple trees. new unit tests in TestRewriteTemplates at end. + +December 27, 2006 + +* Error message appears when you use -> in tree grammar with + output=template and rewrite=true for alt that is not simple + node or tree ref. + +* no more $stop attribute for tree parsers; meaningless/useless. + Removed from TreeRuleReturnScope also. + +* rule text attribute in tree parser must pull from token buffer. + Makes no sense otherwise. added getTokenStream to TreeNodeStream + so rule $text attr works. CommonTreeNodeStream etc... now let + you set the token stream so you can access later from tree parser. + $text is not well-defined for rules like + + slist : stat+ ; + + because stat is not a single node nor rooted with a single node. + $slist.text will get only first stat. I need to add a warning about + this... + +* Fixed http://www.antlr.org:8888/browse/ANTLR-76 for Java. + Enhanced TokenRewriteStream so it accepts any object; converts + to string at last second. Allows you to rewrite with StringTemplate + templates now :) + +* added rewrite option that makes -> template rewrites do replace ops for + TokenRewriteStream input stream. In output=template and rewrite=true mode + same as before 'cept that the parser does + + ((TokenRewriteStream)input).replace( + ((Token)retval.start).getTokenIndex(), + input.LT(-1).getTokenIndex(), + retval.st); + + after each rewrite so that the input stream is altered. 
Later refs to + $text will have rewrites. Here's a sample test program for grammar Rew. + + FileReader groupFileR = new FileReader("Rew.stg"); + StringTemplateGroup templates = new StringTemplateGroup(groupFileR); + ANTLRInputStream input = new ANTLRInputStream(System.in); + RewLexer lexer = new RewLexer(input); + TokenRewriteStream tokens = new TokenRewriteStream(lexer); + RewParser parser = new RewParser(tokens); + parser.setTemplateLib(templates); + parser.program(); + System.out.println(tokens.toString()); + groupFileR.close(); + +December 26, 2006 + +* BaseTree.dupTree didn't dup recursively. + +December 24, 2006 + +* Cleaned up some comments and removed field treeNode + from MismatchedTreeNodeException class. It is "node" in + RecognitionException. + +* Changed type from Object to BitSet for expecting fields in + MismatchedSetException and MismatchedNotSetException + +* Cleaned up error printing in lexers and the messages that it creates. + +* Added this to TreeAdaptor: + /** Return the token object from which this node was created. + * Currently used only for printing an error message. + * The error display routine in BaseRecognizer needs to + * display where in the input the error occurred. If your + * tree implementation does not store information that can + * lead you to the token, you can create a token filled with + * the appropriate information and pass that back. See + * BaseRecognizer.getErrorMessage(). + */ + public Token getToken(Object t); + +December 23, 2006 + +* made BaseRecognizer.displayRecognitionError nonstatic so people can + override it. Not sure why it was static before. + +* Removed state/decision message that comes out of no + viable alternative exceptions, as that was too much. + removed the decision number from the early exit exception + also. During development, you can simply override + displayRecognitionError from BaseRecognizer to add the stuff + back in if you want. 
+ +* made output go to an output method you can override: emitErrorMessage() + +* general cleanup of the error emitting code in BaseRecognizer. Lots + more stuff you can override: getErrorHeader, getTokenErrorDisplay, + emitErrorMessage, getErrorMessage. + +December 22, ... [truncated message content] |