Revision: 5167
http://oorexx.svn.sourceforge.net/oorexx/?rev=5167&view=rev
Author: jfaucher
Date: 2009-09-13 21:17:27 +0000 (Sun, 13 Sep 2009)
Log Message:
-----------
Parse syntax diagrams, generate PNG and reference these PNG from the doc.
For an example of generated PDF, go to :
http://sites.google.com/site/jfaucherfr/oorexx/syntax-diagrams
Modified Paths:
--------------
incubator/DocMusings/makevalidxml/_diary.txt
incubator/DocMusings/makevalidxml/directory.rex
incubator/DocMusings/makevalidxml/myxmlparser.cls
incubator/DocMusings/makevalidxml/transformdir.rex
incubator/DocMusings/makevalidxml/transformfile.rex
incubator/DocMusings/railroad/_diary.txt
incubator/DocMusings/railroad/declarations.xml
incubator/DocMusings/railroad/syntaxdiagram2svg/css/syntaxdiagram.css
incubator/DocMusings/railroad/syntaxdiagram2svg/js/constants.js
incubator/DocMusings/railroad/syntaxdiagram2svg/resource/constants.xml
incubator/DocMusings/railroad/syntaxdiagram2svg/transform.xsl
Added Paths:
-----------
incubator/DocMusings/makevalidxml/arguments.cls
incubator/DocMusings/makevalidxml/indentedstream.cls
incubator/DocMusings/makevalidxml/sd2image.rex
incubator/DocMusings/makevalidxml/sdparser.cls
incubator/DocMusings/makevalidxml/sdtokenizer.cls
incubator/DocMusings/makevalidxml/sdxmlizer.cls
incubator/DocMusings/makevalidxml/trace.cls
Removed Paths:
-------------
incubator/DocMusings/makevalidxml/arguments.rex
Modified: incubator/DocMusings/makevalidxml/_diary.txt
===================================================================
--- incubator/DocMusings/makevalidxml/_diary.txt 2009-09-13 20:28:12 UTC (rev 5166)
+++ incubator/DocMusings/makevalidxml/_diary.txt 2009-09-13 21:17:27 UTC (rev 5167)
@@ -1,4 +1,157 @@
===============================================================================
+2009 september 12
+
+Generated the 4.0.0 documentation with syntax diagrams as image.
+Configuration :
+WinXP + Cygwin + OpenJade + TexLive
+For an example of generated PDF, go to :
+http://sites.google.com/site/jfaucherfr/oorexx/syntax-diagrams
+
+Procedure :
+Go to directory docs/releases/4.0.0
+Copy the directory trunk to trunk.out
+Apply the fixes described below (9 aug and 5 sept) in trunk
+Run : transformdir -syntdiag trunk trunk.out (25 min on my laptop).
+Go to directory trunk.out and make the doc as usual.
+The syntax diagrams that can't be converted remain as text.
+
+Remember :
+There are errors remaining in oodialog : userdialog.sgml and utilityclasses.
+The files windowBaseCommon.sgml and windowExtensionsCommon.sgml are not processed.
+
+
+Todo :
+Improve the quality of the image, the PNG looks a little bit blurred.
+Two image formats are generated from SVG : PDF and PNG
+PDF images are very good, even if viewed at 400%.
+The transformed XML contains a reference to each format, but the references
+to the PDF are not taken into account.
+
+Todo :
+Implement -dsssl, -xslt and -xinclude.
+
+
+===============================================================================
+2009 september 5
+
+Directories are now processed recursively.
+
+Added option -syntdiag
+When the input <file>.sgml contains at least one syntax diagram, a file
+sd_<file>.xml is created, which contains (for each syntax diagram) the text of
+the syntax diagram, a diagnostic and (if valid) a DITA XML representation.
+
+To be recognized as a syntax diagram, the text must be in CDATA and must contain
+at least one entry point ">>" or "|-" at the beginning of a line.
+
+Some errors are detected during the tokenization or the parsing :
+oodialog\basedialog.sgml
+ line 1837, near aBaseDialog~AutoDetection : must be terminated by ><
+ line 2662, near aBaseDialog~ConnectMouseCapture : must be terminated by ><
+ line 2801, near >>-aBaseDialog~ConnectKeyPress( : must shift the line one character to the right
+oodialog\listcontrolc.sgml :
+ line 1862, near aListControl~ModifyColumn : replace "." by "+"
+oodialog\standarddialog.sgml :
+ line 856, near >>-aCheckList~Init( : the second >> must be a continuation >-
+ line 1206, near FileNameDialog( : replace the first "+" by "-"
+ line 1664, near >>--CheckList( : the second >> must be a continuation >-
+oodialog\userdialog.sgml :
+ line 6593, near >>--addInput( : a "+" is missing for "+-,-style-+"
+ line 6896, near >>--addInputGroup( : a "+" is missing for "+-,-style-+"
+ line 7391, near >>--addInputGroup( : a "+" is missing for "+-,-style-+"
+ and the same error which occurs several times : wrong optional declaration (I think), not fixed :
+ line 1056, near dlg~addGroupBox(
+ line 2337, near addRadioGroup(
+ line 2576, near addRadioStem(
+ line 3117, near addCheckGroup(
+ line 3368, near addCheckBoxStem(
+ line 3694, near dlg~addStatic(
+ line 4543, near addWhiteRect(-
+ line 4659, near addWhiteFrame(
+ line 4775, near addGrayRect(
+ line 4891, near addGrayFrame(
+ line 5007, near addBlackRect(
+ line 5123, near addBlackFrame(
+ line 5239, near addEtchedFramed(
+ line 5354, near addEtchedHorizontal(
+ line 5469, near addEtchedVertical(
+oodialog\utilityclasses.sgml :
+ line 241, near >>-.DlgUtil~version( : two "+" are missing
+ line 315, near >>-.DlgUtil~comCtl32Version( : two "+" are missing
+ line 2121, near aDlgAreaU~Init(Dialog : this sd looks strange, don't know how to fix that...
+rexxref\collclasses.sgml :
+ line 1314, near makeString(-+ :
+ replace the second "+" by "-" and move the end of the choice (CHAR) to the same position as (LINE)
+ line 1676, near toString(-+ :
+ replace the second "+" by "-" and move the end of the choice (CHAR) to the same position as (LINE)
+rexxref\dire.sgml :
+ line 362, near >>-::CONSTANT : must be terminated by ><
+rexxref\funct.sgml :
+ line 3431, near >>-RANDOM( : the final "+" of "+--min--+" must be shifted one character to the right
+ line 4937, near >>-TRANSLATE( :
+ the closing parenthesis is not at the right place, must be after the last parameter
+ the three lines of the last block "+-pos-+ +-,length-+" must be moved up one line
+ (to be similar to string~translate)
+ BUT... Seems still not good to me : TRANSLATE("abcdef", , , , 2, 3) is NOT allowed by the diagram
+ pos and length should be declared optional in the continuation of pad (I think)
+rexxref\fundclasses.sgml :
+ line 2719, near >>-caselessCompareTo( : a " " must be replaced by "-"
+ line 2881, near >>-caselessMatch( : the last "+" must be shifted to the right, to align it with the upper "+"
+ line 3189, near >>-compareTo( : a " " must be replaced by "-"
+ line 4097, near >>-match( : the last "+" must be shifted to the right, to align it with the upper "+"
+ line 4169, near >>-max-+ : replace the "." by "+" (or the "+" by ".")
+ line 4199, near >>-min-+ : replace the "." by "+" (or the "+" by ".")
+ line 5603, near >>-call-+ : replace the "." by "+" (or the "+" by ".")
+ line 4669, near >>-translate : the last "+" must be shifted to the right, to align it with the upper "+"
+ This syntax diagram seems not good : translate(, , , 3, 2) is NOT allowed by the diagram
+rexxref\instrc.sgml :
+ line 388, near >>-CALL : must end with ><
+ line 390, near >>-CALL : must align the end choice with the upper "+"
+rexxref\intro.sgml :
+ line 1900, near >>-receiver-+ : replace "+" by "." (or "." by "+")
+rexxref\preface.sgml :
+ line 163, near |--expansion : this syntax diagram must be put in CDATA to be recognized.
+rexxref\rexutil.sgml :
+ line 2029, near >>-SysGetMessage( :
+ not supported, the 2nd "+" has no counterpart (must have a -+-, not just a -+)
+ solution : do a clean separation for each end of choice
+ line 2035, near >>-SysGetMessage( : the 2nd "+" must be shifted one character left, to be aligned
+ line 2086, near >>-SysGetMessageX( :
+ not supported, the 2nd "+" has no counterpart (must have a -+-, not just a -+)
+ solution : do a clean separation for each end of choice
+ line 2212, near 2nd >>-SysIni( : the line must be shifted 10 characters to the left
+ line 3366, near >>-SysShutdownSystem( :
+ replace "*" by "+"
+ not supported, the 2nd "+" has no counterpart (must have a -+-, not just a -+)
+ solution : do a clean separation for each end of choice
+rexxref\utilityclasses.sgml :
+ line 141, near >>-fromNormalDate( : the 3rd "+" has no counterpart
+ line 169, near >>-fromEuropeanDate( : the 3rd "+" has no counterpart
+ line 199, near >>-fromOrderedDate( : the 3rd "+" has no counterpart
+ line 228, near >>-fromStandardDate( : the 3rd "+" has no counterpart
+ line 256, near >>-fromUsaDate( : the 3rd "+" has no counterpart
+ line 446, near >>-init(year,month,day-+ : the last "+" must be replaced by ")"
+ line 3303, near >>-caselessMatch( : the last "+" must be shifted to the right, to be aligned
+ line 3718, near >>-match( : the last "+" must be shifted to the right, to be aligned
+ line 3993, near >>-translate : the last "+" must be shifted to the right, to align it with the upper "+"
+ This syntax diagram seems not good (same problem as string~translate)
+rxftp\rxftp :
+ line 1096, near myftpobj~FtpSetUser( : the 2nd "+" has no counterpart
+
+Remember, fragments :
+rexxref\funct.sgml : Stream commands
+rexxref\instrc.sgml : DO repetitor, conditional
+rexxref\instrc.sgml : LOOP repetitor, conditional
+rexxref\instrc.sgml : RAISE options, options exit
+rexxref\streamclasses.sgml : Stream commands
+rexxref\streamclasses.sgml : Stream open
+winextensions\winregistry.sgml : open access
+
+
+Todo : Implement -dsssl, -xslt and -xinclude.
+
+
+===============================================================================
2009 august 09
Renamed makevalidxml to transformfile because the script is not limited to
@@ -38,7 +191,6 @@
line 4294 : <</para>
rexxpg\classicapi.sgml
line 3042 : missing closing ">" : </title
- line 3105 : remove this tag <section>, it has no counterpart
rexxref\instrc.sgml
line 3576 : Use &lt; instead of "<" in <computeroutput><=</computeroutput>
winextensions\windowmanager.sgml
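The recognition rule noted in the 5 september diary entry (an entry point ">>" or "|-" at the beginning of a line) can be sketched as follows. This is only an illustration, not the tokenizer's actual code: the sample lines and the variable names are made up, and it assumes that the entry point sits in the first column, with no leading blanks.

```rexx
-- Hypothetical sample : the text of one CDATA section, one item per line
lines = .array~new
lines~append(">>-RANDOM(--+-----+--)--><")
lines~append("            +-max-+")

-- The text is accepted as soon as one line starts with an entry point
isSyntaxDiagram = .false
do line over lines
    start = line~left(2)
    if start == ">>" | start == "|-" then do
        isSyntaxDiagram = .true
        leave
    end
end
say "syntax diagram :" isSyntaxDiagram
```

A text with no such line would leave isSyntaxDiagram at .false and, per the procedure in the diary, would remain as plain text in the documentation.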
Copied: incubator/DocMusings/makevalidxml/arguments.cls (from rev 5081, incubator/DocMusings/makevalidxml/arguments.rex)
===================================================================
--- incubator/DocMusings/makevalidxml/arguments.cls (rev 0)
+++ incubator/DocMusings/makevalidxml/arguments.cls 2009-09-13 21:17:27 UTC (rev 5167)
@@ -0,0 +1,77 @@
+::requires 'string2args.rex'
+
+::class CommonArguments public subclass Directory
+::attribute args
+
+::method init
+ self~setMethod("UNKNOWN", "return ''") -- I want '' instead of .nil when accessing an unset entry
+ self~debug = .false
+ self~dsssl = .false
+ self~dump = .false
+ self~errors = .List~new
+ self~help = .false
+ self~syntdiag = .false
+ self~xinclude = .false
+ self~xslt = .false
+
+ -- Tokenize the arguments, if needed
+ use strict arg callType, arguments -- always an array
+ select
+ when callType == "COMMAND" & arguments~items == 1 then self~args = String2Args(arguments[1])
+ when callType == "SUBROUTINE" & arguments~items == 1 & arguments[1]~isA(.array) then self~args = arguments[1]
+ otherwise self~args = arguments
+ end
+
+ -- Use makeArray to have a non-sparse array,
+ -- because omitted parameters have no corresponding index,
+ -- and we ignore omitted parameters here.
+ loop i=1 to self~args~items
+ if self~args[i] == "" then self~args~remove(i)
+ end
+ self~args = self~args~makeArray
+
+ if self~args~items == 0 then do
+ self~help = .true
+ return
+ end
+
+::method parseOption
+ use strict arg option
+ select
+ when "-debug"~caseLessEquals(option) then do
+ self~debug = .true
+ self~debugOption = option
+ end
+ when "-dsssl"~caseLessEquals(option) then do
+ self~dsssl = .true
+ self~dssslOption = option
+ end
+ when "-dump"~caseLessEquals(option) then do
+ self~dump = .true
+ self~dumpOption = option
+ end
+ when "-help"~caseLessEquals(option) then do
+ self~help = .true
+ self~helpOption = option
+ end
+ when "-syntdiag"~caseLessEquals(option) then do
+ self~syntdiag = .true
+ self~syntdiagOption = option
+ end
+ when "-xinclude"~caseLessEquals(option) then do
+ self~xinclude = .true
+ self~xincludeOption = option
+ end
+ when "-xslt"~caseLessEquals(option) then do
+ self~xslt = .true
+ self~xsltOption = option
+ end
+ otherwise return 0
+ end
+ return 1
+
+::method verifyOptions
+ self~errors~empty
+ if self~dsssl & self~xslt then self~errors~append("[error] You can't specify both "self~dssslOption" and "self~xsltOption)
+ if self~xinclude & \self~xslt then self~errors~append("[error] You must specify -xslt if you want to use "self~xincludeOption)
+ return self~errors~isEmpty
Property changes on: incubator/DocMusings/makevalidxml/arguments.cls
___________________________________________________________________
Added: svn:mergeinfo
+
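The two consistency rules enforced by verifyOptions (-dsssl and -xslt are mutually exclusive, -xinclude requires -xslt) can be tried in isolation. A small sketch with plain variables instead of the CommonArguments directory; the option values chosen here are just an example, and the messages are shortened copies of the ones in the class.

```rexx
-- Example : -xinclude was given, -xslt was not
dsssl = .false
xslt = .false
xinclude = .true

errors = .list~new
if dsssl & xslt then errors~append("[error] You can't specify both -dsssl and -xslt")
if xinclude & \xslt then errors~append("[error] You must specify -xslt if you want to use -xinclude")

do error over errors
    say error
end
say "options valid :" errors~isEmpty
```

Only the second rule fires for these values, so the list ends up with a single message.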
Deleted: incubator/DocMusings/makevalidxml/arguments.rex
===================================================================
--- incubator/DocMusings/makevalidxml/arguments.rex 2009-09-13 20:28:12 UTC (rev 5166)
+++ incubator/DocMusings/makevalidxml/arguments.rex 2009-09-13 21:17:27 UTC (rev 5167)
@@ -1,72 +0,0 @@
-::requires 'string2args.rex'
-
-::class CommonArguments public subclass Directory
-::attribute args
-
-::method init
- self~setMethod("UNKNOWN", "return ''") -- I want '' instead of .nil when accessing an unset entry
- self~debug = .false
- self~dsssl = .false
- self~dump = .false
- self~errors = .List~new
- self~help = .false
- self~xinclude = .false
- self~xslt = .false
-
- -- Tokenize the arguments, if needed
- use strict arg callType, arguments -- always an array
- select
- when callType == "COMMAND" & arguments~items == 1 then self~args = String2Args(arguments[1])
- when callType == "SUBROUTINE" & arguments~items == 1 & arguments[1]~isA(.array) then self~args = arguments[1]
- otherwise self~args = arguments
- end
-
- -- Use makeArray to have a non-sparse array,
- -- because omitted parameters have no corresponding index,
- -- and we ignore omitted parameters here.
- loop i=1 to self~args~items
- if self~args[i] == "" then self~args~remove(i)
- end
- self~args = self~args~makeArray
-
- if self~args~items == 0 then do
- self~help = .true
- return
- end
-
-::method parseOption
- use strict arg option
- select
- when "-debug"~caseLessEquals(option) then do
- self~debug = .true
- self~debugOption = option
- end
- when "-dsssl"~caseLessEquals(option) then do
- self~dsssl = .true
- self~dssslOption = option
- end
- when "-dump"~caseLessEquals(option) then do
- self~dump = .true
- self~dumpOption = option
- end
- when "-help"~caseLessEquals(option) then do
- self~help = .true
- self~helpOption = option
- end
- when "-xinclude"~caseLessEquals(option) then do
- self~xinclude = .true
- self~xincludeOption = option
- end
- when "-xslt"~caseLessEquals(option) then do
- self~xslt = .true
- self~xsltOption = option
- end
- otherwise return 0
- end
- return 1
-
-::method verifyOptions
- self~errors~empty
- if self~dsssl & self~xslt then self~errors~append("[error] You can't specify both "self~dssslOption" and "self~xsltOption)
- if self~xinclude & \self~xslt then self~errors~append("[error] You must specify -xslt if you want to use "self~xincludeOption)
- return self~errors~isEmpty
Modified: incubator/DocMusings/makevalidxml/directory.rex
===================================================================
--- incubator/DocMusings/makevalidxml/directory.rex 2009-09-13 20:28:12 UTC (rev 5166)
+++ incubator/DocMusings/makevalidxml/directory.rex 2009-09-13 21:17:27 UTC (rev 5167)
@@ -1,5 +1,5 @@
::routine createDirectory public
- -- Creates the specified directory.
+ -- Creates the specified directory (and recursively the parents if needed).
-- Returns 0 if the directory already exists.
-- Returns 1 if the directory has been created.
-- Returns -1 if the creation failed because a file (not a directory) with the same name already exists.
@@ -7,6 +7,10 @@
use strict arg path
if SysIsFileDirectory(path) then return 0
if SysIsFile(path) then return -1
+ parent = filespec("location", path)
+ if parent == path then parent = filespec("location", path~substr(1, path~length - 1))
+ parentStatus = createDirectory(parent)
+ if parentStatus < 0 then return parentStatus
if SysMkDir(path) <> 0 then return -2
return 1
Added: incubator/DocMusings/makevalidxml/indentedstream.cls
===================================================================
--- incubator/DocMusings/makevalidxml/indentedstream.cls (rev 0)
+++ incubator/DocMusings/makevalidxml/indentedstream.cls 2009-09-13 21:17:27 UTC (rev 5167)
@@ -0,0 +1,86 @@
+/*
+Stream helper to manage indentation.
+The methods "charout", "lineout", and "say" take care of the indentation and
+then forward to the stream. Other methods are directly forwarded to the stream.
+
+Example :
+s = .IndentedStream~new(.stdout)
+s~lineout("<book>")
+s~indent
+s~lineout("<chapter>")
+s~lineout("</chapter>")
+s~dedent
+s~lineout("</book>")
+
+Output :
+<book>
+ <chapter>
+ </chapter>
+</book>
+*/
+
+.IndentedStream~stdout = .IndentedStream~new(.stdout)
+.IndentedStream~stderr = .IndentedStream~new(.stderr)
+
+
+::class "IndentedStream" public
+::constant indentSize 4
+::attribute stdout class
+::attribute stderr class
+::attribute indentLevel
+::attribute mustIndent
+::attribute spaces
+::attribute stream
+
+
+::method init
+ use strict arg stream, indentSize = (self~indentSize)
+ self~indentLevel = 0
+ self~mustIndent = .true
+ self~spaces = " "~copies(indentSize)
+ self~stream = stream
+
+
+::method indent
+ use strict arg -- none
+ self~indentLevel +=1
+ return self
+
+
+::method dedent
+ use strict arg -- none
+ self~indentLevel -=1
+ return self
+
+
+::method indentIfNeeded private
+ use strict arg -- none
+ if self~mustIndent then do
+ loop self~indentLevel
+ self~stream~charout(self~spaces)
+ end
+ self~mustIndent = .false
+ end
+
+
+::method charout
+ self~indentIfNeeded
+ forward to (self~stream)
+
+
+::method lineout
+ self~indentIfNeeded
+ self~mustIndent = .true
+ forward to (self~stream)
+
+
+::method say
+ self~indentIfNeeded
+ self~mustIndent = .true
+ forward to (self~stream)
+
+
+::method unknown
+ use strict arg msg, args
+ forward to (self~stream) message (msg) arguments (args)
+
Modified: incubator/DocMusings/makevalidxml/myxmlparser.cls
===================================================================
--- incubator/DocMusings/makevalidxml/myxmlparser.cls 2009-09-13 20:28:12 UTC (rev 5166)
+++ incubator/DocMusings/makevalidxml/myxmlparser.cls 2009-09-13 21:17:27 UTC (rev 5167)
@@ -574,7 +574,16 @@
/*----------------------------------------------------------------------------*/
+/* Method: cdata_text */
+/* Description: text of the cdata section, without the start/end tags */
/*----------------------------------------------------------------------------*/
+::method cdata_text
+ if self~text~left(8) <> "![CDATA[" then return .nil
+ if self~text~right(2) <> "]]" then return .nil
+ return self~text~substr(9, self~text~length - 10)
+
+/*----------------------------------------------------------------------------*/
+/*----------------------------------------------------------------------------*/
/* Class: XMLERROR */
/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/
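The offsets used by cdata_text follow from the delimiter lengths: "![CDATA[" is 8 characters and "]]" is 2, so the payload starts at position 9 and is 10 characters shorter than the whole text. A short check of that arithmetic on a made-up sample text (the surrounding "<" and ">" are assumed to be already stripped, as the tests on left(8) and right(2) suggest):

```rexx
-- Hypothetical node text, as cdata_text is assumed to see it
text = "![CDATA[>>-match(-+---+-)-><]]"

if text~left(8) <> "![CDATA[" | text~right(2) <> "]]" then payload = .nil
else payload = text~substr(9, text~length - 10)

say payload
```

Here the text is 30 characters long, so substr(9, 20) returns exactly the 20 characters between the delimiters.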
Added: incubator/DocMusings/makevalidxml/sd2image.rex
===================================================================
--- incubator/DocMusings/makevalidxml/sd2image.rex (rev 0)
+++ incubator/DocMusings/makevalidxml/sd2image.rex 2009-09-13 21:17:27 UTC (rev 5167)
@@ -0,0 +1,177 @@
+/****
+Usage :
+ $sourcename [-help] <sd_File> [<logFile>]
+Description :
+ Create a subdirectory <sd_File> (without suffix) and generate an image file
+ for each syntax diagram or fragment in <sd_File>.
+Prerequisites :
+ Depends on xsltproc.
+ Assumes that the XSLT script to generate SVG is located in the directory
+ ../railroad/syntaxdiagram2svg (path relative to current script directory).
+ Depends on the environment variable BATIK_ROOT to retrieve Batik.
+****/
+
+log = .stderr
+parse source . callType me
+arguments = .Arguments~new(callType, arg(1,"array"))
+if arguments~help then call Help me
+do error over arguments~errors
+ log~lineout(error)
+end
+if arguments~help | \arguments~errors~isEmpty then return 1
+
+if arguments~logFile <> "" then log = .stream~new(arguments~logFile)
+
+sdFile = qualify(arguments~sdFile)
+
+if \SysFileExists(sdFile) then do
+ log~lineout("[error] <sd_File> not found")
+ return 1
+end
+
+sdFileDir = filespec("location", sdFile)
+sdFileNameExt = filespec("name", sdFile)
+sdFileExt = filespec("extension", sdFile)
+sdFileName = sdFileNameExt~left(sdFileNameExt~length - sdFileExt~length - 1)
+
+-- Each sd_File has its own subdirectory :
+-- mydir/sd_myfile.xml --> mydir/sd_myfile/
+outputDir = sdFileDir || sdFileName
+if createDirectoryVerbose(outputDir, log) < 0 then return 1
+
+-- For the time being, the path to syntaxdiagram2svg is derived from the current script path,
+-- assuming it is located in the ../railroad/syntaxdiagram2svg directory
+-- (incubator directories)
+meDir = filespec("location", me)
+if meDir~right(1)~matchChar(1, "/\") then meDir = meDir~left(meDir~length - 1)
+meUpperDir = filespec("location", meDir)
+syntaxdiagram2svg = qualify(meUpperDir || "railroad/syntaxdiagram2svg")
+
+if \SysFileExists(syntaxdiagram2svg"/transform.xsl") then do
+ log~lineout("[error] XSLT script for SVG generation not found")
+ return 1
+end
+
+csspath = qualify(syntaxdiagram2svg"/css")
+jspath = qualify(syntaxdiagram2svg"/js")
+xsltscript = qualify(syntaxdiagram2svg"/transform.xsl")
+xsltproc_log = qualify(outputDir"/_xsltproc.log")
+
+-- Todo : test under Linux, is file:/// supported ?
+-- What is certain : file:/// is needed under Windows, otherwise absolute paths are not supported
+'xsltproc --stringparam CSSPATH file:///"'csspath'"',
+ '--stringparam JSPATH file:///"'jspath'"',
+ '--stringparam OUTPUTDIR file:///"'outputDir'"',
+ '"'xsltscript'"',
+ '"'sdFile'"',
+ '>"'xsltproc_log'" 2>&1'
+if RC <> 0 then do
+ log~lineout("[error] XSLT script for SVG generation failed")
+ return 1
+end
+
+BATIK_ROOT = value("BATIK_ROOT",,"ENVIRONMENT")
+if BATIK_ROOT == "" then do
+ log~lineout("[error] The environment variable BATIK_ROOT has no value")
+end
+
+rasterizer = qualify(BATIK_ROOT"/extensions/batik-rasterizer-ext.jar")
+allsvg = qualify(outputDir"/*.svg")
+batik_log = qualify(outputDir"/_batik.log")
+
+-- SVG to PNG
+-- If I surround allsvg by quotes then the wildcard character no longer works.
+-- so don't use a path with spaces...
+'java -Xmx1024M -jar "'rasterizer'" -onload -m image/png 'allsvg' >"'batik_log'" 2>&1'
+if RC <> 0 then do
+ log~lineout("[error] Batik's rasterizer for SVG to PNG generation failed")
+ return 1
+end
+
+-- SVG to PDF (better quality)
+-- If I surround allsvg by quotes then the wildcard character no longer works.
+-- so don't use a path with spaces...
+'java -Xmx1024M -jar "'rasterizer'" -onload -m application/pdf 'allsvg' >"'batik_log'" 2>&1'
+if RC <> 0 then do
+ log~lineout("[error] Batik's rasterizer for SVG to PDF generation failed")
+ return 1
+end
+
+return 0
+
+::requires 'string2args.rex'
+::requires 'help.rex'
+::requires 'directory.rex'
+
+-------------------------------------------------------------------------------
+::class Arguments subclass Directory
+::attribute args
+
+
+::method init
+ self~errors = .List~new
+ self~help = .false
+ self~logFile = ""
+ self~sdFile = ""
+
+ -- Tokenize the arguments, if needed
+ use strict arg callType, arguments -- always an array
+ select
+ when callType == "COMMAND" & arguments~items == 1 then self~args = String2Args(arguments[1])
+ when callType == "SUBROUTINE" & arguments~items == 1 & arguments[1]~isA(.array) then self~args = arguments[1]
+ otherwise self~args = arguments
+ end
+
+ -- Use makeArray to have a non-sparse array,
+ -- because omitted parameters have no corresponding index,
+ -- and we ignore omitted parameters here.
+ loop i=1 to self~args~items
+ if self~args[i] == "" then self~args~remove(i)
+ end
+ self~args = self~args~makeArray
+
+ if self~args~items == 0 then do
+ self~help = .true
+ return
+ end
+
+ -- Process the options
+ loop i=1 to self~args~items
+ option = self~args[i]
+ if option~left(1) <> "-" then leave
+ select
+ when "-help"~caseLessEquals(option) then do
+ self~help = .true
+ self~helpOption = option
+ end
+ otherwise do
+ self~errors~append("[error] Unknown option" option)
+ return
+ end
+ end
+ -- Return now if help requested
+ if self~help then return
+ end
+
+ -- Process the arguments
+ -- sdFile is mandatory
+ if i > self~args~items then do
+ self~errors~append("[error] <sdFile> is missing")
+ return
+ end
+ self~sdFile = self~args[i]~strip
+ i += 1
+
+ -- logFile is optional
+ if i > self~args~items then return
+ self~logFile = self~args[i]~strip
+ if self~logFile~left(1) == "-" then do
+ self~errors~append("[error] Options must come before <sd_File>")
+ return
+ end
+ i += 1
+
+ -- no more argument expected
+ if i > self~args~items then return
+ self~errors~append("[error] Unexpected arguments :" self~args~section(i)~toString("L", " "))
+ return
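sd2image.rex derives the output directory from the file name without its suffix (mydir/sd_myfile.xml --> mydir/sd_myfile/). The sketch below replays that computation on hand-written values instead of calling filespec, so the values of sdFileNameExt and sdFileExt are assumptions. It also adds a guard that is not in the committed script: with an empty extension, the unguarded expression would drop the last character of the name.

```rexx
-- Values that filespec("name", ...) and filespec("extension", ...) are assumed to return
sdFileNameExt = "sd_myfile.xml"
sdFileExt = "xml"

-- Remove ".xml" : the extension length plus one character for the dot
if sdFileExt == "" then sdFileName = sdFileNameExt  -- guard, not in sd2image.rex
else sdFileName = sdFileNameExt~left(sdFileNameExt~length - sdFileExt~length - 1)

outputDir = "mydir/" || sdFileName
say outputDir
```

Whether such a guard is worth adding depends on whether <sd_File> can ever be given without a suffix; the files produced by -syntdiag are named sd_<file>.xml, so the committed expression is fine for them.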
Added: incubator/DocMusings/makevalidxml/sdparser.cls
===================================================================
--- incubator/DocMusings/makevalidxml/sdparser.cls (rev 0)
+++ incubator/DocMusings/makevalidxml/sdparser.cls 2009-09-13 21:17:27 UTC (rev 5167)
@@ -0,0 +1,941 @@
+::requires "sdtokenizer"
+::requires "indentedstream.cls"
+::requires "trace.cls"
+
+
+-------------------------------------------------------------------------------
+/*
+The abstract syntax is made of DITA elements :
+syntaxdiagram
+groupseq
+groupchoice
+groupcomp
+fragment
+fragref
+kwd
+var
+oper
+delim
+sep
+repsep
+See http://www.ditainfocenter.com/eclipsehelp/index.jsp?topic=/org.ditausers.infomanager.LangSpec1.1/common/pr-d.html
+*/
+
+::class "SyntaxDiagramParser" public
+::attribute errorCount
+::attribute mainEntries
+::attribute messages -- list of strings
+::attribute nameGenerator -- any object which understands the message syntax_diagram_name(prefix, suffix)
+::attribute tokenizer
+
+
+::method parse class
+ use strict arg tokenizer, nameGenerator
+ parser = self~new(tokenizer, nameGenerator)
+ tokenizer~clearVisitMarks
+ tokenizer~dereferenceContinuations = .true
+ do entry over tokenizer~mainEntries
+ select
+ when entry~isA(.sdtokenizer~BeginingOfStatement) then do
+ syntaxDiagram = .SyntaxDiagram~new(.nil, parser)
+ parser~mainEntries~append(syntaxDiagram)
+ syntaxDiagram~parse(entry)
+ syntaxDiagram = syntaxDiagram~simplify
+ syntaxDiagram = syntaxDiagram~groupify
+ syntaxDiagram~hrefbase = syntaxDiagram~title
+ end
+ when entry~isA(.sdtokenizer~BeginingOfFragment) then do
+ -- Each fragment is stored in a separate syntax diagram.
+ -- Why ? because sometimes, the fragment is alone in the text,
+ -- with no preceding syntax diagram. So adopt a common behavior,
+ -- whatever the situation.
+ -- No title for this syntax diagram => no image will be generated by syntaxdiagram2svg
+ syntaxDiagram = .SyntaxDiagram~new(.nil, parser)
+ parser~mainEntries~append(syntaxDiagram)
+ fragment = .Fragment~new(syntaxDiagram, parser)
+ -- a title will be assigned to this fragment ==> an image will be generated by syntaxdiagram2svg
+ fragment~parse(entry)
+ fragment = fragment~simplify
+ fragment = fragment~groupify
+ syntaxDiagram~hrefbase = fragment~title
+ end
+ when entry~isA(.sdtokenizer~Comment) then do
+ comment = .Comment~new(.nil, parser)
+ parser~mainEntries~append(comment)
+ comment~parse(entry)
+ -- no need of simplification
+ end
+ otherwise parser~addError("[error] Unexpected token : "entry)
+ end
+ end
+ if \ tokenizer~checkVisitCompletness then parser~addError("[error] Some tokens have not been visited")
+ return parser
+
+
+::method init
+ use strict arg tokenizer, nameGenerator
+ self~errorCount = 0
+ self~mainEntries = .List~new
+ self~messages = .List~new
+ self~nameGenerator = nameGenerator
+ self~tokenizer = tokenizer
+ self~objectName = self~class~id
+
+
+::method addMessage
+ use strict arg message
+ self~messages~append(message)
+
+
+::method addError
+ use strict arg message
+ self~addMessage(message)
+ self~errorCount += 1
+ -- raise syntax 4
+
+
+::method dumpAbstractSyntaxTree
+ use strict arg stream = (.IndentedStream~stdout)
+ stream~lineout("Abstract syntax tree :")
+ do entry over self~mainEntries
+ entry~inspect(stream)
+ end
+
+
+::method dump
+ use strict arg stream = (.IndentedStream~stdout)
+ self~tokenizer~dumpTextWithPointers(stream)
+ stream~lineout("")
+ -- diagnostic
+ if self~errorCount == 0 then stream~lineout("Parsing : OK")
+ else stream~lineout("Parsing : KO")
+ do m over self~messages
+ stream~lineout(m)
+ end
+ stream~lineout("")
+ -- tokens not visited
+ if \ self~tokenizer~visitIsComplete then do
+ stream~lineout("Tokens not visited :")
+ self~tokenizer~dumpNotVisitedTokens(stream)
+ stream~lineout("")
+ end
+ -- Abstract syntax elements
+ self~dumpAbstractSyntaxTree(stream)
+
+
+::method inspect -- for debug
+ use strict arg stream = (.IndentedStream~stdout)
+ stream~lineout(self~string" :")
+ stream~lineout("")
+ self~dump(stream)
+
+
+-------------------------------------------------------------------------------
+::class "AbstractSyntaxElement"
+::attribute childs
+
+::attribute firstToken get
+::attribute firstToken set
+ expose firstToken
+ use strict arg firstToken
+ self~updateObjectName
+
+::attribute lastToken
+
+::attribute parser
+
+::attribute text get
+::attribute text set
+ expose text
+ use strict arg text
+ self~updateObjectName
+
+::constant default 1
+::constant required 2
+::constant optional 3
+::attribute importance
+
+::method importanceString
+ use strict arg -- none
+ select
+ when self~importance == .AbstractSyntaxElement~default then return "default"
+ when self~importance == .AbstractSyntaxElement~required then return "required"
+ when self~importance == .AbstractSyntaxElement~optional then return "optional"
+ otherwise return self~importance
+ end
+
+
+::method init
+ use strict arg parent, parser
+ self~childs = .List~new
+ self~firstToken = .nil
+ self~lastToken = .nil
+ if parent <> .nil then parent~childs~append(self)
+ self~parser = parser
+ self~text = ""
+ self~importance = 0
+
+
+::method parse
+ -- the simple case : the grammar element is made of one single token
+ use strict arg token
+ self~firstToken = token
+ self~lastToken = token
+ self~text = token~value
+ token~setVisitMark
+ return token~rightToken -- next token to parse
+
+
+::method updateObjectName
+ objectName = "The '"self~text"' "self~class~id
+ if self~firstToken <> .nil then do
+ objectName ||= " at ["self~firstToken~firstSymbol~line","self~firstToken~firstSymbol~col"]"
+ end
+ self~objectName = objectName
+
+
+::method simplify
+ -- The parser creates a lot of intermediate groups that can be eliminated
+ if .t~istraced("simplify") then call trace .t~traceoption("simplify")
+ use strict arg -- none
+ if self~childs~isEmpty then return self
+ newChilds = .List~new
+ do child over self~childs
+ newChild = child~simplify
+ newChilds~append(newChild)
+ end
+ self~childs = newChilds
+ return self
+
+
+::method groupify
+ -- syntaxdiagram and fragment can have only groups, fragref or fragment as childs.
+ -- The simplification may move kwd, var or other similar elements to toplevel.
+ -- This method takes care to put them back in a group.
+ use strict arg -- none
+ newChilds = .List~new
+ group = .nil
+ do child over self~childs
+ if child~isA(.Kwd) | child~isA(.Var) | child~isA(.Characters) then do
+ if group == .nil then do
+ group = .Group~new(.nil, self~parser)
+ group~kind = .Group~groupseq
+ end
+ group~childs~append(child)
+ end
+ else do
+ if group <> .nil then do
+ newChilds~append(group)
+ group = .nil
+ end
+ newChilds~append(child)
+ end
+ end
+ if group <> .nil then newChilds~append(group)
+ self~childs = newChilds
+ return self
+
+
+::method inspect -- for debug
+ use strict arg stream = (.IndentedStream~stdout)
+ stream~lineout(self~string)
+ stream~indent
+ self~inspectProperties(stream)
+ stream~lineout("childs :")
+ stream~indent
+ do child over self~childs
+ child~inspect(stream)
+ end
+ stream~dedent
+ stream~dedent
+
+
+::method inspectProperties -- for debug
+ use strict arg stream = (.IndentedStream~stdout)
+ stream~lineout("firstToken = "self~firstToken)
+ stream~lineout("lastToken = "self~lastToken)
+ -- stream~lineout("parser = "self~parser)
+ stream~lineout("text = '"self~text"'")
+ stream~lineout("importance = "self~importanceString)
+
+
+-------------------------------------------------------------------------------
+::class "SyntaxDiagram" subclass "AbstractSyntaxElement"
+::attribute hasFirstIdentifier -- to make proper distinction between kwd and var
+::attribute hrefbase -- the basename that will be used to build the complete image href
+::attribute title
+
+
+::method init
+ forward class (super) continue
+ self~hasFirstIdentifier = .false
+ self~hrefbase = ""
+ self~title = ""
+
+
+::method text
+ use strict arg -- none
+ return self~title
+
+
+::method parse
+ -- ( ( title) (optional) then ( groupseq or groupchoice or groupcomp or fragref or fragment or synblk or synnote or synnoteref) (any number) )
+ use strict arg beginingOfStatement
+ self~firstToken = beginingOfStatement
+ beginingOfStatement~setVisitMark
+ self~title = self~parser~nameGenerator~syntax_diagram_name("sd", beginingOfStatement~label) -- will be used as filename when generating the SVG
+ token = beginingOfStatement~rightToken
+ do forever
+ group = .Group~new(.nil, self~parser)
+ token = group~parse(token)
+ if group~childs~isEmpty then leave
+ self~childs~append(group)
+ end
+ if token~isA(.sdtokenizer~EndOfStatement) then do
+ token~setVisitMark
+ self~lastToken = token
+ end
+ else self~parser~addError("[error] Expected an EndOfStatement, got : "token)
+ return .nil -- no more tokens
+
+
+::method inspectProperties
+ use strict arg stream = (.IndentedStream~stdout)
+ self~inspectProperties:super(stream)
+ stream~lineout("title = '"self~title"'")
+
+
+-------------------------------------------------------------------------------
+::class "Fragment" subclass "AbstractSyntaxElement"
+::attribute hasFirstIdentifier -- to make proper distinction between kwd and var
+::attribute title
+
+
+::method init
+ forward class (super) continue
+ self~hasFirstIdentifier = .false
+ self~title = ""
+
+
+::method text
+ use strict arg -- none
+ return self~title
+
+
+::method parse
+ -- ( ( title) (optional) then ( groupseq or groupchoice or groupcomp or fragref or synnote or synnoteref) (any number) )
+ use strict arg beginingOfFragment
+ self~firstToken = beginingOfFragment
+ beginingOfFragment~setVisitMark
+ self~title = self~parser~nameGenerator~syntax_diagram_name("sd", beginingOfFragment~label) -- will be used as filename when generating the SVG
+ token = beginingOfFragment~rightToken
+ do forever
+ group = .Group~new(.nil, self~parser)
+ token = group~parse(token)
+ if group~childs~isEmpty then leave
+ self~childs~append(group)
+ end
+ if token~isA(.sdtokenizer~EndOfFragment) then do
+ token~setVisitMark
+ self~lastToken = token
+ end
+ else self~parser~addError("[error] Expected an EndOfFragment, got : "token)
+ return .nil -- no more tokens
+
+
+::method inspectProperties
+ use strict arg stream = (.IndentedStream~stdout)
+ self~inspectProperties:super(stream)
+ stream~lineout("title = '"self~title"'")
+
+
+-------------------------------------------------------------------------------
+::class "Group" subclass "AbstractSyntaxElement"
+::constant groupseq 1
+::constant groupchoice 2
+::constant groupcomp 3
+::attribute kind
+
+::method kindString
+ use strict arg -- none
+ select
+ when self~kind == .Group~groupseq then return "groupseq"
+ when self~kind == .Group~groupchoice then return "groupchoice"
+ when self~kind == .Group~groupcomp then return "groupcomp"
+ otherwise return self~kind
+ end
+
+::attribute repsep
+
+::attribute r2lDispatcher private -- used by ParseChoice, ParseRepeated
+::attribute endMismatch private -- used by ParseChoice, ParseRepeated
+
+::method init
+ forward class (super) continue
+ self~kind = 0
+ self~repsep = .nil
+
+
+::method parse
+ -- ( ( title) (optional) then ( repsep) (optional) then ( groupseq or groupchoice or groupcomp or fragref or kwd or var or delim or oper or sep or synnote or synnoteref) (any number) )
+ if .t~istraced("parse") then call trace .t~traceoption("parse")
+ use strict arg token
+ self~firstToken = token
+ token = self~parseGroup(token)
+ if self~kind == 0 then self~kind = .Group~groupseq -- an empty group can occur (for example) in a choice : the main path can be empty if no required value
+ if token <> .nil then self~lastToken = token~leftToken
+ return token
+
+
+::method parseGroup private
+ /*
+ We test whether the repeat arrow has already been visited because parseRepeated will call parseGroup
+ recursively on the same token when the repeat looks like this :
+       +-,--------+
+       V          |
+     --+----------+--
+       +-,-string-+
+ */
+ if .t~istraced("parseGroup") then call trace .t~traceoption("parseGroup")
+ use strict arg token
+ -- skip dash separators
+ do while token~isA(.sdtokenizer~l2rPath)
+ if token~upperToken~isA(.sdtokenizer~RepeatArrow) then do
+ if \ token~upperToken~alreadyVisited then leave
+ end
+ token~setVisitMark
+ token = token~rightToken
+ end
+ if token == .nil then return .nil
+ if token~upperToken~isA(.sdtokenizer~RepeatArrow), \ token~upperToken~alreadyVisited then return self~parseRepeated(token)
+ if token~isA(.sdtokenizer~l2rCrossroads) then do
+ if token~isl2rDispatcher then return self~parseChoice(token)
+ if token~isr2lDispatcher then return token
+ self~parser~addError("[error] "token" has both L2R and R2L paths")
+ return .nil
+ end
+ return self~parseSeqComp(token)
+
+
+::method parseSeqComp private
+ -- groupseq or groupcomp
+ if .t~istraced("parseSeqComp") then call trace .t~traceoption("parseSeqComp")
+ use strict arg token
+ do forever
+ previousToken = token
+ select
+ when token~isA(.sdtokenizer~Identifier) then do
+ -- try to improve the distinction kwd/var : if first identifier of the current entry then it's a kwd (it's rare to have a variable as first element)
+ if self~parser~mainEntries~lastItem~hasFirstIdentifier == .false then token = .Kwd~new(self, self~parser)~parse(token)
+ -- ALL UPPERCASE (i.e. no lowercase) ==> keyword
+ else if token~hasLowerCase == .false then token = .Kwd~new(self, self~parser)~parse(token)
+ -- After a "~" we have a kwd (most of the time)
+ else if token~leftToken <> .nil, token~leftToken~value == "~" then token = .Kwd~new(self, self~parser)~parse(token)
+ -- Before a "(", we have a kwd (most of the time)
+ else if token~rightToken <> .nil, token~rightToken~value == "(" then token = .Kwd~new(self, self~parser)~parse(token)
+ -- The doc says "Variables appear in all lowercase letters" but this is not true...
+ -- We have a lot of variables whose name is a mix of upper- and lowercase letters, so...
+ else token = .Var~new(self, self~parser)~parse(token)
+ self~parser~mainEntries~lastItem~hasFirstIdentifier = .true
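+ -- Illustration (a reading of the rules above ; the way the tokenizer splits the text is assumed) :
+ -- in >>-open(-+---------------+-,subkey-+-- the identifier "open" is the first identifier
+ -- of the entry, so it becomes a kwd.
+ -- "subkey" is not the first identifier, contains lowercase letters, does not follow "~"
+ -- and is not followed by "(", so it becomes a var.
+ -- An identifier such as ALL (no lowercase letter) always becomes a kwd.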
+ end
+ when token~isA(.sdtokenizer~Number) then token = .Kwd~new(self, self~parser)~parse(token)
+ when token~isA(.sdtokenizer~Character) then token = .Characters~new(self, self~parser)~parse(token)
+ when token~isA(.sdtokenizer~EscapedCharacter) then token = .Characters~new(self, self~parser)~parse(token)
+ when token~isA(.sdtokenizer~BeginingOfFragmentReference) then token = .FragRef~new(self, self~parser)~parse(token)
+ otherwise return token -- end of group
+ end
+
+ -- Determine the kind of group (if first iteration) or test if the group is ended.
+ -- Got a problem with +-| repetitor |-+ (no separators here, tokens were "-| " and " |-")
+ -- because the fragref was put in a groupcomp, and the generated SVG was not good (empty box).
+ -- To bypass that, I modified the tokenizer to recognize "| " and " |".
+ -- The problem will still occur with such an (unlikely) string : before| repetitor |after.
+ if token~isA(.sdtokenizer~l2rPath) then do
+ if self~kind == .Group~groupcomp then return token -- we have reached a separator --> end of groupcomp
+ if self~kind == 0 then self~kind = .Group~groupseq -- we have a separator after the first element --> groupseq
+ end
+ else do
+ if self~kind == .Group~groupseq then do
+ -- the current grammar element is not followed by a separator --> it is part of a groupcomp
+ -- must undo the last append and close the groupseq (the visit marks are not undone, not a problem)
+ self~childs~remove(self~childs~last)
+ return previousToken
+ end
+ if self~kind == 0 then self~kind = .Group~groupcomp -- we don't have a separator after the first element --> groupcomp
+ end
+
+ -- skip dash separators
+ do while token~isA(.sdtokenizer~l2rPath), \ token~upperToken~isA(.sdtokenizer~RepeatArrow)
+ token~setVisitMark
+ token = token~rightToken
+ end
+ end
+
+
+::method parseChoice private
+ if .t~istraced("parseChoice") then call trace .t~traceoption("parseChoice")
+ use strict arg l2rDispatcher
+ l2rDispatcher~setVisitMark
+ self~kind = .Group~groupchoice
+ self~r2lDispatcher = .nil -- choice's end, will be calculated for each path, must be the same for all paths
+ self~endMismatch = .false
+
+ -- default choice (above main path, normally only one path, but support several paths above the main path)
+ token = l2rDispatcher~upperToken
+ if \ token~isA(.sdtokenizer~RepeatArrow) then do
+ if self~parseChoiceAboveMainPath(token) == .nil then return .nil
+ end
+ -- main path
+ if self~parseChoiceMainPath(l2rDispatcher) == .nil then return .nil
+ -- optional choices (below main path)
+ if self~parseChoiceBelowMainPath(l2rDispatcher~lowerToken) == .nil then return .nil
+
+ if self~endMismatch then do
+ self~parser~addError( "[error] The choice "l2rDispatcher" does not end consistently")
+ end
+ if self~r2lDispatcher == .nil then return .nil
+
+ /*
+ The next test is needed to support this case : the [+] is both the end of the enclosed group
+ and the end of the repeat main path.
+      +--------------------+
+      V                    |
+     ---+-BORDERSELECT----[+]-
+        +-CHECKBOXES-------+
+ */
+ if self~r2lDispatcher~annotation == "EndOfRepeat" then return self~r2lDispatcher
+ return self~r2lDispatcher~rightToken
+
+
+::method parseChoiceAboveMainPath private
+ -- default choice (above main path, normally only one path, but support several paths above the main path)
+ if .t~istraced("parseChoiceAboveMainPath") then call trace .t~traceoption("parseChoiceAboveMainPath")
+ use strict arg token
+ previousToken = .nil -- for accurate error message
+ do forever
+ select
+ when token == .nil then leave -- end of b2t vertical path
+ when token~isA(.sdtokenizer~b2tPath) then token~setVisitMark
+ when token~isA(.sdtokenizer~b2tCrossroads) | token~isA(.sdtokenizer~b2tCorner) then do
+ token~setVisitMark
+ -- follow the horizontal path of default choice
+ group = .Group~new(.nil, self~parser)
+ group~importance = self~default
+ -- the topmost path must be the first : we are moving b2t so must always be the first child
+ self~childs~insert(group, .nil)
+ endTokenClass = token~class
+ if endTokenClass == .sdtokenizer~b2tCorner then endTokenClass = .sdtokenizer~l2rCorner -- todo : should modify the tokenizer to be symmetrical
+ endtoken = group~parseDelimitedPath(token, .false, endTokenClass)
+ if endtoken == .nil then return .nil
+ -- follow the t2b vertical path to find the choice's end
+ choiceEnd = self~getr2lDispatcher(endtoken, "t2b")
+ if choiceEnd == .nil then return .nil
+ if self~r2lDispatcher == .nil then self~r2lDispatcher = choiceEnd
+ else if self~r2lDispatcher <> choiceEnd then self~endMismatch = .true
+ end
+ otherwise do
+ if token == .nil & previousToken == .nil then self~parser~addError("[error] Unexpected NIL token while walking b2t (parseChoiceAboveMainPath)")
+ else if token == .nil then self~parser~addError("[error] Unexpected NIL token while walking b2t from "previousToken)
+ else self~parser~addError("[error] Unexpected token while walking b2t : "token)
+ return .nil
+ end
+ end
+ previousToken = token
+ token = token~upperToken
+ end
+ return .true -- no token to return here, so...
+
+
+::method parseChoiceMainPath private
+ -- main path
+ if .t~istraced("parseChoiceMainPath") then call trace .t~traceoption("parseChoiceMainPath")
+ use strict arg token
+ group = .Group~new(self, self~parser)
+ -- group~importance = self~required -- not needed
+ endtoken = group~parseDelimitedPath(token, .false, .sdtokenizer~l2rCrossroads)
+ if endtoken == .nil then return .nil
+ endtoken~setVisitMark
+ if self~r2lDispatcher == .nil then self~r2lDispatcher = endtoken
+ else if self~r2lDispatcher <> endtoken then self~endMismatch = .true
+ return .true -- no token to return here, so...
+
+
+::method parseChoiceBelowMainPath private
+ -- optional choices (below main path)
+ if .t~istraced("parseChoiceBelowMainPath") then call trace .t~traceoption("parseChoiceBelowMainPath")
+ use strict arg token
+ previousToken = .nil -- for accurate error message
+ do forever
+ select
+ when token == .nil then leave -- end of t2b vertical path
+ when token~isA(.sdtokenizer~t2bPath) then token~setVisitMark
+ when token~isA(.sdtokenizer~t2bCrossroads) | token~isA(.sdtokenizer~t2bCorner) then do
+ token~setVisitMark
+ -- follow the horizontal path of optional choice
+ group = .Group~new(self, self~parser)
+ -- group~importance = self~optional -- not needed
+ endTokenClass = token~class
+ if endTokenClass == .sdtokenizer~t2bCorner then endTokenClass = .sdtokenizer~l2rCorner -- todo : should modify the tokenizer to be symmetrical
+ endtoken = group~parseDelimitedPath(token, .false, endTokenClass)
+ if endtoken == .nil then return .nil
+ -- follow the b2t vertical path to find the choice's end
+ choiceEnd = self~getr2lDispatcher(endtoken, "b2t")
+ if choiceEnd == .nil then return .nil
+ if self~r2lDispatcher == .nil then self~r2lDispatcher = choiceEnd
+ else if self~r2lDispatcher <> choiceEnd then self~endMismatch = .true
+ end
+ otherwise do
+ if token == .nil & previousToken == .nil then self~parser~addError("[error] Unexpected NIL token while walking t2b (parseChoiceBelowMainPath)")
+ else if token == .nil then self~parser~addError("[error] Unexpected NIL token while walking t2b from "previousToken)
+ else self~parser~addError("[error] Unexpected token while walking t2b : "token)
+ return .nil
+ end
+ end
+ previousToken = token
+ token = token~lowerToken
+ end
+ return .true -- no token to return here, so...
+
+
+::method parseRepeated private
+ if .t~istraced("parseRepeated") then call trace .t~traceoption("parseRepeated")
+ use strict arg tokenUnderArrow
+ self~r2lDispatcher = .nil -- repeat's end, will be calculated for the repeat arrow and the repeated path, must be the same
+ self~endMismatch = .false
+ repeatArrow = tokenUnderArrow~upperToken
+ repeatArrow~setVisitMark
+ -- follow the b2t vertical end path of the repeat arrow, until the first fork is reached
+ -- and then parse the l2r horizontal path (will bring the repsep)
+ -- and then follow the t2b vertical path to reach the repeat's end
+ token = repeatArrow~upperToken
+ previousToken = .nil -- for accurate error message
+ do forever
+ select
+ when token~isA(.sdtokenizer~b2tPath) then token~setVisitMark
+ when token~isA(.sdtokenizer~b2tCrossroads) | token~isA(.sdtokenizer~b2tCorner) then do
+ token~setVisitMark
+ -- follow the horizontal path of the repeat arrow
+ self~repsep = .RepSep~new(.nil, self~parser)
+ endTokenClass = token~class
+ if endTokenClass == .sdtokenizer~b2tCorner then endTokenClass = .sdtokenizer~l2rCorner -- todo : should modify the tokenizer to be symmetrical
+ endtoken = self~repsep~parseDelimitedPath(token, .false, endTokenClass)
+ if endtoken == .nil then return .nil
+ -- follow the vertical start path of the repeat arrow
+ repeatEnd = self~getr2lDispatcher(endtoken, "t2b")
+ if repeatEnd == .nil then return .nil
+ if self~r2lDispatcher == .nil then self~r2lDispatcher = repeatEnd
+ else if self~r2lDispatcher <> repeatEnd then self~endMismatch = .true
+ repeatEnd~annotation = "EndOfRepeat"
+ -- support only one horizontal path (i.e. only one repeat arrow)
+ if token~upperToken <> .nil then self~parser~addError("[error] Not supported : More than one repeat path reaching "repeatArrow) -- not blocking
+ leave -- we have processed the first fork
+ end
+ otherwise do
+ if token == .nil & previousToken == .nil then self~parser~addError("[error] Unexpected NIL token while following repeat arrow")
+ else if token == .nil then self~parser~addError("[error] Unexpected NIL token while following repeat arrow from "previousToken)
+ else self~parser~addError("[error] Unexpected token while following repeat arrow "token)
+ return .nil
+ end
+ end
+ previousToken = token
+ token = token~upperToken
+ end
+
+ -- parse the main path
+ endtoken = self~parseDelimitedPath(tokenUnderArrow, .true, .sdtokenizer~l2rCrossroads)
+ if self~r2lDispatcher == .nil then self~r2lDispatcher = endtoken
+ else if self~r2lDispatcher <> endtoken then self~endMismatch = .true
+
+ if self~endMismatch then do
+ self~parser~addError( "[error] The repeat at "tokenUnderArrow" does not end consistently")
+ end
+ return self~r2lDispatcher~rightToken
+
+
+::method parseDelimitedPath private
+ if .t~istraced("parseDelimitedPath") then call trace .t~traceoption("parseDelimitedPath")
+ use strict arg startToken /*the start delimiter (+ or . or ...)*/, parseStart, endTokenClass
+ self~kind = .Group~groupseq
+ self~firstToken = startToken
+ if parseStart then token = startToken
+ else token = startToken~rightToken
+ do forever
+ group = .Group~new(.nil, self~parser)
+ token = group~parse(token)
+ if group~childs~isEmpty then leave
+ self~childs~append(group)
+ if token~isA(endTokenClass) then do
+ if \ token~isA(.sdtokenizer~l2rCrossroads) then leave
+ -- additional test needed for l2rCrossroads
+ if token~isr2lDispatcher then leave
+ end
+ end
+ if \ token~isA(endTokenClass) then do
+ self~parser~addError( "[error] While parsing delimited path from "startToken)
+ self~parser~addMessage(" Expected a "endTokenClass~id" as end token, got "token)
+ return .nil
+ end
+ token~setVisitMark
+ self~lastToken = token
+ return token -- the end delimiter (+ or .)
+
+
+::method getr2lDispatcher private
+ -- walk through the closing vertical path of a choice or repeat, searching for the "+" on the main path (the r2lDispatcher)
+ if .t~istraced("getr2lDispatcher") then call trace .t~traceoption("getr2lDispatcher")
+ use strict arg token, direction
+ if direction == "t2b" then token = token~lowerToken
+ if direction == "b2t" then token = token~upperToken
+ previousToken = .nil -- for accurate error message
+ do forever
+ select
+ when direction == "t2b", token~isA(.sdtokenizer~b2tPath) | token~isA(.sdtokenizer~b2tCrossroads) then do -- yes, b2t because the tokenization was made b2t
+ token~setVisitMark
+ previousToken = token -- remember the last valid token before moving on, for the error message
+ token = token~lowerToken
+ end
+ when direction == "b2t", token~isA(.sdtokenizer~t2bPath) | token~isA(.sdtokenizer~t2bCrossroads) then do -- yes, t2b because the tokenization was made t2b
+ token~setVisitMark
+ previousToken = token -- remember the last valid token before moving on, for the error message
+ token = token~upperToken
+ end
+ when token~isA(.sdtokenizer~l2rCrossroads) then leave -- we have reached the r2lDispatcher
+ otherwise do
+ if token == .nil & previousToken == .nil then self~parser~addError("[error] Unexpected NIL token while walking "direction" (getr2lDispatcher)")
+ else if token == .nil then self~parser~addError("[error] Unexpected NIL token while walking "direction" from "previousToken)
+ else self~parser~addError("[error] Unexpected token while walking "direction" : "token)
+ return .nil
+ end
+ end
+ end
+ return token -- the r2lDispatcher
+
+
+::method simplify
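+ -- Illustration (informal notation, assumed for this example only) :
+ -- groupseq[ kwd ALL ] (only one child, no repsep)
+ -- is replaced by : kwd ALL
+ -- groupchoice[ groupseq[], kwd ALL ] (empty main path, one path below, children already simplified)
+ -- is replaced by : kwd ALL with importance = optional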
+ if .t~istraced("simplify") then call trace .t~traceoption("simplify")
+ use strict arg -- none
+ group = self~simplify:super -- simplify each child separately
+ select
+ when group~repsep <> .nil then return group -- a group with repeat can't be simplified
+ when group~childs~items == 1 then do
+ -- a group with only one child can be replaced by its child (if no information loss)
+ child = group~childs~firstItem
+ if group~importance <> 0 & child~importance <> 0 then return group
+ if child~importance == 0 then child~importance = group~importance
+ return child
+ end
+ when group~kind == .Group~groupchoice & group~childs~items == 2 then do
+ -- a choice with 2 childs may be just an optional element (if main path is empty and no loss of information)
+ index1 = group~childs~first
+ path1 = group~childs[index1] -- not [1] ! todo : improve doc...
+ index2 = group~childs~next(index1)
+ path2 = group~childs[index2] -- not [2] ! todo : improve doc...
+ if path1~isA(.Group) & path1~childs~isEmpty & path2~importance == 0 then do
+ path2~importance = group~optional
+ return path2
+ end
+ return group
+ end
+ otherwise return group
+ end
+
+
+::method inspectProperties
+ use strict arg stream = (.IndentedStream~stdout)
+ self~inspectProperties:super(stream)
+ stream~lineout("kind = "self~kindString)
+ stream~charout("repsep : ")
+ if self~repsep == .nil then stream~lineout(self~repsep~string)
+ else do
+ stream~lineout("")
+ stream~indent
+ self~repsep~inspect(stream)
+ stream~dedent
+ end
+
+
+-------------------------------------------------------------------------------
+::class "RepSep" subclass "Group"
+
+
+::method init
+ forward class (super) continue
+ self~text = ""
+
+
+::method parse
+ forward class (super) continue
+ buffer = .MutableBuffer~new
+ -- For the time being, the repsep must be a simple text :
+ -- iterate over the grammar elements that can be converted to text, and concatenate
+ do child over self~childs
+ select
+ when child~isA(.Kwd) then buffer~append(child~text)
+ when child~isA(.Var) then buffer~append(child~text)
+ when child~isA(.Characters) then buffer~append(child~text)
+ otherwise do
+ self~parser~addError("[error] "child" is not supported in repsep")
+ end
+ end
+ end
+ self~text = buffer~string
+
+
+-------------------------------------------------------------------------------
+::class "FragRef" subclass "AbstractSyntaxElement"
+
+
+::method init
+ forward class (super) continue
+ self~text = ""
+
+
+::method parse
+ use strict arg beginingOfFragmentReference
+ self~firstToken = beginingOfFragmentReference
+ beginingOfFragmentReference~setVisitMark
+ token = beginingOfFragmentReference~rightToken
+ buffer = .MutableBuffer~new
+ -- iterate over the tokens that can be converted to text, and concatenate
+ do forever
+ select
+ when token~isA(.sdtokenizer~Identifier) then buffer~append(token~value)
+ when token~isA(.sdtokenizer~Character) then buffer~append(token~value)
+ when token~isA(.sdtokenizer~EscapedCharacter) then buffer~append(token~value)
+ when token~isA(.sdtokenizer~Number) then buffer~append(token~value)
+ otherwise leave
+ end
+ token~setVisitMark
+ token = token~rightToken
+ end
+ self~text = buffer~string
+ if \ token~isA(.sdtokenizer~EndOfFragmentReference) then do
+ self~parser~addError("[error] Expected an EndOfFragmentReference, got : "token)
+ return .nil
+ end
+ token~setVisitMark
+ self~lastToken = token
+
+ -- semantic check
+ if self~text == "" then do
+ self~parser~addError("[error] The text attribute of "self" is empty")
+ end
+
+ return token~rightToken
+
+
+-------------------------------------------------------------------------------
+::class "Kwd" subclass "AbstractSyntaxElement"
+
+
+-------------------------------------------------------------------------------
+::class "Var" subclass "AbstractSyntaxElement"
+
+
+-------------------------------------------------------------------------------
+::class "Characters" subclass "AbstractSyntaxElement"
+::constant delim 1
+::constant oper 2
+::constant sep 3
+::attribute kind
+
+::method kindString
+ use strict arg -- none
+ select
+ when self~kind == .Characters~delim then return "delim"
+ when self~kind == .Characters~oper then return "oper"
+ when self~kind == .Characters~sep then return "sep"
+ otherwise return self~kind
+ end
+
+
+::method init
+ forward class (super) continue
+ self~kind = 0
+ self~text = ""
+
+
+::method parse
+ use strict arg token
+ self~firstToken = token
+ buffer = .MutableBuffer~new
+ do forever
+ select
+ when token~isA(.sdtokenizer~Character) then buffer~append(token~value)
+ when token~isA(.sdtokenizer~EscapedCharacter) then buffer~append(token~value)
+ otherwise leave
+ end
+ token~setVisitMark
+ self~lastToken = token
+ token = token~rightToken
+ end
+ self~text = buffer~string
+ -- not sure that the distinction between oper, sep and delim is important.
+ select
+ when self~text == " " then self~kind = .Characters~oper
+ when self~text == "||" then self~kind = .Characters~oper
+ when self~text == "°" then self~kind = .Characters~oper
+ when self~text == "-" then self~kind = .Characters~oper
+ when self~text == "*" then self~kind = .Characters~oper
+ when self~text == "/" then self~kind = .Characters~oper
+ when self~text == "%" then self~kind = .Characters~oper
+ when self~text == "//" then self~kind = .Characters~oper
+ when self~text == "**" then self~kind = .Characters~oper
+ when self~text == "=" then self~kind = .Characters~oper
+ when self~text == "\=" then self~kind = .Characters~oper
+ when self~text == ">" then self~kind = .Characters~oper
+ when self~text == "<" then self~kind = .Characters~oper
+ when self~text == "><" then self~kind = .Characters~oper
+ when self~text == "<>" then self~kind = .Characters~oper
+ when self~text == ">=" then self~kind = .Characters~oper
+ when self~text == "\<" then self~kind = .Characters~oper
+ when self~text == "<=" then self~kind = .Characters~oper
+ when self~text == "\>" then self~kind = .Characters~oper
+ when self~text == "==" then self~kind = .Characters~oper
+ when self~text == "\==" then self~kind = .Characters~oper
+ when self~text == ">>" then self~kind = .Characters~oper
+ when self~text == "<<" then self~kind = .Characters~oper
+ when self~text == ">>=" then self~kind = .Characters~oper
+ when self~text == "\<<" then self~kind = .Characters~oper
+ when self~text == "<<=" then self~kind = .Characters~oper
+ when self~text == "\>>" then self~kind = .Characters~oper
+ when self~text == "&" then self~kind = .Characters~oper
+ when self~text == "|" then self~kind = .Characters~oper
+ when self~text == "&&" then self~kind = .Characters~oper
+ when self~text == "\" then self~kind = .Characters~oper
+ when self~text == "~" then self~kind = .Characters~oper
+ when self~text == "~~" then self~kind = .Characters~oper
+
+ when self~text == "," then self~kind = .Characters~sep
+ when self~text == ";" then self~kind = .Characters~sep
+ when self~text == ":" then self~kind = .Characters~sep
+
+ when self~text == "(" then self~kind = .Characters~delim
+ when self~text == ")" then self~kind = .Characters~delim
+ when self~text == "[" then self~kind = .Characters~delim
+ when self~text == "]" then self~kind = .Characters~delim
+
+ otherwise self~kind = .Characters~sep
+ end
+ return token
+
+
+::method inspectProperties
+ use strict arg stream = (.IndentedStream~stdout)
+ self~inspectProperties:super(stream)
+ stream~lineout("kind = "self~kindString)
+
+
+-------------------------------------------------------------------------------
+::class "Comment" subclass "AbstractSyntaxElement"
+
+
+-------------------------------------------------------------------------------
+::class sdparser public -- a kind of namespace
+
+
+::method unknown class
+ -- Collisions are possible with parser classes (ex : Comment).
+ -- This method forwards everything to the directory of classes of the package (a kind of namespace).
+ -- Usage : .sdparser~MyClass
+ forward to (.Context~package~classes)
+
Added: incubator/DocMusings/makevalidxml/sdtokenizer.cls
===================================================================
--- incubator/DocMusings/makevalidxml/sdtokenizer.cls (rev 0)
+++ incubator/DocMusings/makevalidxml/sdtokenizer.cls 2009-09-13 21:17:27 UTC (rev 5167)
@@ -0,0 +1,1847 @@
+::requires "rxregexp.cls"
+::requires "indentedstream.cls"
+::requires "trace.cls"
+
+
+-------------------------------------------------------------------------------
+/*
+The text of the syntax diagram is stored in a two-dimensional array which contains
+one-character symbols.
+
+A symbol has a position in the array (line, col) and holds 4 strings,
+all starting with the symbol's character :
+- left to right string (l2r)
+- right to left string (r2l)
+- top to bottom string (t2b)
+- bottom to top string (b2t)
+During analysis, a symbol is associated with a token which, most of the time,
+covers several symbols.
+
+A token has a text and is chained in the four directions :
+- right token
+- left token
+- lower token
+- upper token
+
+Several syntax diagrams can be extracted from a single textual syntax diagram,
+depending on the number of beginnings of statement, or in case of fragments.
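+
+Illustration (a sketch under assumptions : the Symbol class is not shown in this part
+of the file, so the way the four strings are built is inferred from the description above).
+For a diagram made of the two lines "ab" and "cd", the symbol "a" at (1,1) would hold :
+    lines = .Array~of("ab", "cd") ; line = 1 ; col = 1
+    l2r = lines[line]~substr(col)                                                 -- "ab"
+    r2l = lines[line]~left(col)~reverse                                           -- "a"
+    t2b = "" ; do i = line to lines~items ; t2b ||= lines[i]~subchar(col) ; end   -- "ac"
+    b2t = "" ; do i = line to 1 by -1 ; b2t ||= lines[i]~subchar(col) ; end       -- "a"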
+*/
+
+::class "SyntaxDiagramTokenizer" public
+::constant lineDigits 3 -- number of digits when displaying the line number
+::attribute continuationCount -- counter of ContinuedOnNextLine already resolved
+::attribute continuationEntries -- array of ContinuedFromPreviousLine, direct access by index 1..n
+::attribute dereferenceContinuations -- when true, the continuations are automatically dereferenced (not visible from the parser when walking through the tokens)
+::attribute errorCount
+::attribute mainEntries -- list of BeginingOfStatement or BeginingOfFragment
+::attribute messages -- list of strings
+::attribute name -- syntax diagram name, not needed by the tokenizer, but it's a way to bring this info between the various steps
+::attribute symbols -- array[line,col] of Symbol
+::attribute text -- the text of the syntax diagram to tokenize (contains eol characters)
+::attribute tokenizationIsComplete
+::attribute tokens -- table of all tokens created during analysis, used to check the completeness of analysis
+::attribute visitIsComplete
+
+
+::method tokenize class
+ use strict arg text, endofline
+ tokenizer = self~new(text, endofline)
+ if tokenizer~findEntries == .false then return .nil
+ -- From here, we assume that the text contains one or several syntax diagrams
+ if tokenizer~checkFirstLineVersusOtherLines == .false then return tokenizer
+ do entry over tokenizer~mainEntries
+ if entry~isA(.Comment) then iterate
+ tokenizer~l2rTokenize(entry, entry~nextSymbol) -- left to right
+ end
+ if tokenizer~checkTokenizationCompletness == .false then tokenizer~addError("[error] Some symbols have not been tokenized")
+ tokenizer~clearVisitMarks
+ do entry over tokenizer~mainEntries
+ entry~visit
+ end
+ if tokenizer~checkVisitCompletness == .false then tokenizer~addError("[error] Some tokens have not been visited")
+ return tokenizer
+
+
+::method init
+ use strict arg text, endofline
+ self~continuationCount = 0
+ self~continuationEntries = .Array~new
+ self~dereferenceContinuations = .false
+ self~errorCount = 0
+ self~mainEntries = .List~new
+ self~messages = .List~new
+ self~name = ""
+ self~text = text
+ self~tokenizationIsComplete = .false
+ self~tokens = .IdentityTable~new
+ self~visitIsComplete = .false
+ self~objectName = self~class~id
+
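+ -- Illustration (assumed values, for this example only) :
+ -- with text = "ab" || endofline || "c", the first loop below finds 2 lines of 2 and 1 characters,
+ -- so lineDimension = 2 and colDimension = 2.
+ -- The second loop fills [1,1], [1,2] and [2,1]. The cell [2,2] stays unassigned in self~symbols
+ -- and is created as a space symbol on first access through the "[]" method.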
+ lineDimension = 0
+ colDimension = 0
+ i = 1
+ do while i <= text~length
+ lineDimension += 1
+ eolpos = text~pos(endofline, i)
+ if eolpos == 0 then lastCol = text~length
+ else lastCol = eolpos - 1
+ colCount = lastCol - i + 1
+ if colCount > colDimension then colDimension = colCount
+ i = lastCol + 1 + endofline~length
+ end
+
+ self~symbols = .Array~new(lineDimension, colDimension)
+
+ line = 0
+ i = 1
+ do while i <= text~length
+ line += 1
+ eolpos = text~pos(endofline, i)
+ if eolpos == 0 then lastCol = text~length
+ else lastCol = eolpos - 1
+ col = 0
+ do j = i to lastCol
+ col += 1
+ self~symbols[line, col] = .Symbol~new(text~subchar(j), self, line, col)
+ end
+ i = lastCol + 1 + endofline~length
+ end
+
+
+::method "[]"
+ use strict arg line, col
+ if line < 1 | line > self~lineDimension then return .nil
+ if col < 1 | col > self~colDimension then return .nil
+ symbol = self~symbols[line, col]
+ -- create a space symbol on the fly if needed (inside the limits, any uninitialized cell is a space)
+ if symbol == .nil then self~symbols[line, col] = .Symbol~new(" ", self, line, col, .true)
+ return self~symbols[line, col]
+
+
+::method "[]="
+ use strict arg value, line, col
+ if line < 1 | line > self~lineDimension then raise syntax 93.900 array ("line must be in range 1.."self~lineDimension", got "line)
+ if col < 1 | col > self~colDimension then raise syntax 93.919 array ("col must be in range 1.."self~colDimension", got "col)
+ self~symbols[line, col] = value
+
+
+::method lineDimension
+ use strict arg -- none
+ return self~symbols~dimension(1)
+
+
+::method colDimension
+ use strict arg -- none
+ return self~symbols~dimension(2)
+
+
+::method addMessage
+ use strict arg message
+ self~messages~append(message)
+
+
+::method addError
+ use strict arg message
+ self~addMessage(message)
+ self~errorCount += 1
+
+
+::method findEntries private
+ use strict arg -- none
+ lastLabel = "" -- A comment whose last character is ":" is considered as a label (will be assigned to the next syntax diagram or fragment)
+ /* 1
+ entry#1 2 >>-open(-+---------------+-,subkey-+--------------+-)----------><
+ 3 +-parent_handle-+ +-,-| access |-+
+ 4
+ entry#2 5 access:
+ 6
+ 7 +-ALL-+
+ entry#3 8 |--+-----+------------------------------------------------------>
+ ... 9 ...
+ */
+ containsSyntaxDiagram = .false
+ do line = 1 to self~lineDimension
+ symbol = self[line, 1]
+ if symbol == .nil then iterate
+ BOL = symbol~l2rText~verify(" ") -- position of first non-space character
+ if BOL == 0 then iterate -- blank line
+ symbol = self[line, BOL]
+ select
+ when .BeginingOfStatement~match(symbol) then do
+ token = .BeginingOfStatement~tokenize(symbol)
+ if token <> .nil then do
+ if lastLabel <> "" then token~label = lastLabel
+ self~mainEntries~append(token)
+ containsSyntaxDiagram = .true
+ end
+ lastLabel = ""
+ end
+ when .BeginingOfFragment~match(symbol) then do
+ token = .BeginingOfFragment~tokenize(symbol)
+ if token <> .nil then do
+ if lastLabel <> "" then token~label = lastLabel
+ self~mainEntries~append(token)
+ containsSyntaxDiagram = .true
+ end
+ lastLabel = ""
+ end
+ when .ContinuedFromPreviousLine~match(symbol) then do
+ token = .ContinuedFromPreviousLine~tokenize(symbol)
+ if token <> .nil then self~continuationEntries~append(token)
+ lastLabel = ""
+ end
+ when .Comment~match(symbol) then do
+ token = .Comment~tokenize(symbol)
+ if token <> .nil then do
+ self~mainEntries~append(token)
+ if token~text~right(1) == ":" then lastLabel = token~text
+ end
+ end
+ otherwise nop
+ end
+ end
+ return containsSyntaxDiagram
+
+
+::method checkFirstLineVersusOtherLines private
+ /*
+ Not sure I need to check that, but...
+ If the first line is not empty, then all other lines must be empty.
+ Why ? because the first line is on the same line as <![CDATA[, so does not start on the same column as next lines.
+ <![CDATA[>>-myfunc(-->
+ >--)--><]]>
+ A multi-line diagram which includes the first line could be a mess...
+ */
+ use strict arg -- none
+ if self[1,1]~l2rText~strip <> "" then do
+ do line = 2 to self~lineDimension
+ if self[line, 1]~l2rText~strip <> "" then do
+ self~addError("[error] The first line is not empty : all other lines must be empty")
+ return .false
+ end
+ end
+ end
+ return .true
+
+
+::method l2rTokenize -- left to right
+ use strict arg previousToken, symbol
+ stop = .false
+ loop until stop
+ token = .nil
+ select
+ when symbol == .nil then stop = .true
+ when symbol~isDummy then stop = .true
+ when symbol~token <> .nil then do
+ -- already tokenized
+ select
+ when symbol~token~isA(.ContinuedFromPreviousLine) then token = symbol~token -- Don't stop on ContinuedFromPreviousLine (so that all the tokens get chained)
+ when symbol~token~isA(.t2bCrossroads) | symbol~token~isA(.b2tCrossroads) then do
+ -- this is a fix for the problem described in dumpTextWithPointers
+ token = symbol~token
+ stop = .true
+ end
+ otherwise stop = .true
+ end
+ end
+ when .EndOfStatement~match(symbol) then token = .EndOfStatement~tokenize(symbol)
+ when .EndOfFragment~match(symbol) then token = .EndOfFragment~tokenize(symbol)
+ when .BeginingOfFragmentReference~match(symbol) then token = .BeginingOfFragmentReference~tokenize(symbol)
+ when .EndOfFragmentReference~match(symbol) then token = .EndOfFragmentReference~tokenize(symbol)
+ when .ContinuedOnNextLine~match(symbol) then token = .ContinuedOnNextLine~tokenize(symbol)
+ when .l2rCrossroads~match(symbol) then token = .l2rCrossroads~tokenize(symbol)
+ when .l2rPath~match(symbol) then token = .l2rPath~tokenize(symbol)
+ when .Identifier~match(symbol) then token = .Identifier~tokenize(symbol)
+
+ /*when symbol~char = " " then return -- stop horizontal path*/ -- bad idea ! we have some cases like "+--, precision, --+"
+
+ when symbol~char = "+" then stop = .true -- not an l2rCrossroads, stop horizontal path (will be processed by b2tTokenize or t2bTokenize)
+
+ when .l2rCorner~match(symbol) then do
+ token = .l2rCorner~tokenize(symbol)
+ stop = .true
+ end
+
+ -- keep it after l2rpath : "-2-" is Number 2, not -2. But "- -2-" is -2.
+ -- keep it after "+" : "-+2-" is end of path followed by 2 (illegal), not Number +2. But "- +2-" is +2.
+ when .Number~match(symbol) then token = .Number~tokenize(symbol)
+
+ when .EscapedCharacter~match(symbol) then token = .EscapedCharacter~tokenize(symbol)
+ otherwise token = .Character~tokenize(symbol)
+ end
+ .Token~l2rChain(previousToken, token)
+ if token == .nil then stop = .true
+ else do
+ previousToken = token
+ symbol = token~nextSymbol
+ end
+ end
+
+
+::method b2tTokenize -- bottom to top
+ use strict arg previousToken, symbol
+ stop = .false
+ loop until stop
+ token = .nil
+ select
+ when symbol == .nil then stop = .true
+ when symbol~isDummy then stop = .true
+ when symbol~token <> .nil then do
+ -- already tokenized
+ if symbol~token~isA(.l2rCorner) then do
+ -- needed to chain correctly
+ token = symbol~token
+ end
+ stop = .true
+ end
+ when .b2tPath~match(symbol) then token = .b2tPath~tokenize(symbol)
+ when .b2tCrossroads~match(symbol) then token = .b2tCrossroads~tokenize(symbol)
+ when .b2tCorner~match(symbol) then do
+ token = .b2tCorner~tokenize(symbol)
+ stop = .true
+ end
+ otherwise stop = .true
+ end
+ .Token~b2tChain(previousToken, token)
+ if token == .nil then stop = .true
+ else do
+ previousToken = token
+ symbol = token~nextSymbol
+ end
+ end
+
+
+::method t2bTokenize -- top to bottom
+ use strict arg previousToken, symbol
+ stop = .false
+ loop until stop
+ token = .nil
+ select
+ when symbol == .nil then stop = .true
+ when symbol~isDummy then stop = .true
+ when symbol~token <> .nil then do
+ -- already tokenized
+ if symbol~token~isA(.l2rCorner) then do
+ -- needed to chain correctly
+ token = symbol~token
+ end
+ stop = .true
+ end
+ when .t2bPath~match(symbol) then token = .t2bPath~tokenize(symbol)
+ when .t2bCrossroads~match(symbol) then token = .t2bCrossroads~tokenize(symbol)
+ when .t2bCorner~match(symbol) then do
+ token = .t2bCorner~tokenize(symbol)
+ stop = .true
+ end
+ otherwise stop = .true
+ end
+ .Token~t2bChain(previousToken, token)
+ if token == .nil then stop = .true
+ else do
+ previousToken = token
+ symbol = token~nextSymbol
+ end
+ end
+
+
+::method checkTokenizationCompletness private
+ use strict arg -- none
+ self~tokenizationIsComplete = .false
+ do line = 1 to self~lineDimension
+ do col = 1 to self~colDimension
+ symbol = self~symbols[line, col]
+ if symbol == .nil then iterate
+ if symbol~char == " " then iterate
+ if symbol~token == .nil then return .false
+ end
+ end
+ self~tokenizationIsComplete = .true
+ return .true
+
+
+::method checkVisitCompletness
+ use strict arg -- none
+ self~visitIsComplete = .false
+ do token over self~tokens
+ if \ token~alreadyVisited then return .false
+ end
+ self~visitIsComplete = .true
+ return .true
+
+
+::method clearVisitMarks
+ use strict arg -- none
+ do token over self~tokens
+ token~clearVisitMark
+ end
+
+
+::method dumpColumnRuler private
+ use strict arg stream = (.IndentedStream~stdout), showPointers = .false
+ pointerSpace = ""
+ if showPointers then pointerSpace = " "
+ -- column ruler, 1st line : one digit each ten
+ stream~charout(" "~copies(self~lineDigits + 1))
+ do col = 1 to self~colDimension
+ if col // 10 == 0 then stream~charout(col // 100 % 10 || pointerSpace)
+ else stream~charout(" "pointerSpace)
+ end
+ stream~lineout("")
+ -- column ruler, 2nd line : digits 1..9
+ stream~charout(" "~copies(self~lineDigits + 1))
+ do col = 1 to self~colDimension
+ stream~charout(col // 10 || pointerSpace)
+ end
+ stream~lineout("")
+
+
+::method dumpText
+ /*
+ 1 2 3 4 5 6
+ 12345678901234567890123456789012345678901234567890123456789012345
+ 1
+ 2 >>-STATEMENT--+---------------+--------------------------------><
+ 3 +-optional_item-+
+ */
+ use strict arg stream = (.IndentedStream~stdout)
+ self~dumpColumnRuler(stream)
+ do line = 1 to self~lineDimension
+ stream~charout(line~format(self~lineDigits)" ")
+ stream~lineout(self[line,1]~l2rText)
+ end
+
+
+::method dumpNotTokenizedSymbols
+ use strict arg stream = (.IndentedStream~stdout)
+ self~dumpColumnRuler(stream)
+ -- lines of the syntax diagram
+ do line = 1 to self~lineDimension
+ stream~charout(line~format(self~lineDigits)" ")
+ do col = 1 to self~colDimension
+ symbol = self~symbols[line, col]
+ select
+ when symbol == .nil then stream~charout(" ")
+ when symbol~token == .nil then stream~charout(symbol~char)
+ otherwise stream~charout(".")
+ end
+ end
+ stream~lineout("")
+ end
+
+
+::method dumpTokenizedSymbols
+ use strict arg stream = (.IndentedStream~stdout), inspectTokens = .false
+ if inspectTokens then inspectedTokens = .IdentityTable~new
+ do line = 1 to self~lineDimension
+ do col = 1 to self~colDimension
+ symbol = self~symbols[line, col]
+ select
+ when symbol == .nil then nop
+ when symbol~token == .nil then nop
+ otherwise do
+ stream~charout(symbol~string)
+ stream~charout(", tokenized as ")
+ if \ symbol~token~isA(.l2rPath), inspectTokens, inspectedTokens[symbol~token] == .nil then do
+ -- to reduce the size of the dump, I don't inspect l2rPath
+ -- the first occurrence of other tokens is inspected
+ stream~lineout("")
+ stream~indent
+ symbol~token~inspect(stream)
+ stream~dedent
+ inspectedTokens[symbol~token] = .true -- mark inspected
+ end
+ else do
+ -- next occurrences are just shortly described
+ stream~charout(symbol~token~string)
+ stream~lineout("")
+ end
+ end
+ end
+ end
+ end
+
+
+::method dumpNotVisitedTokens
+ use strict arg stream = (.IndentedStream~stdout)
+ self~dumpColumnRuler(stream)
+ -- lines of the syntax diagram
+ do line = 1 to self~lineDimension
+ stream~charout(line~format(self~lineDigits)" ")
+ do col = 1 to self~colDimension
+ symbol = self~symbols[line, col]
+ select
+ when symbol == .nil then stream~charout(" ")
+ when symbol~token == .nil then stream~charout(" ")
+ when symbol~token~alreadyVisited == .false then do
+ -- read the char only once the symbol is known to exist (a cell may be .nil)
+ char = symbol~char
+ if char == " " then char = "#" -- to see something
+ stream~charout(char)
+ end
+ otherwise stream~charout(".")
+ end
+ end
+ stream~lineout("")
+ end
+
+
+::method dumpTextWithPointers
+ /*
+ Same as dumpText, with additional information : if the token is linked with a
+ neighboring token, then a dot is displayed between the boundary character of the
+ current token and the boundary character of the neighboring token, otherwise a space
+ is displayed. That way, you can visually check if a path is broken somewhere.
+ This applies in the four directions. Since the chaining is bidirectional, we need
+ only to show the rightToken and lowerToken pointers.
+ Ex :
+ 1 2 3
+ 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
+ 1
+
+ 2 >_>.-.S_T_A_T_E_M_E_N_T.-.-.+.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.+.-
+ . .
+ 3 +.-.o_p_t_i_o_n_a_l___i_t_e_m.- +
+
+ In this example, we can see that the path is broken after optional_item (now fixed)
+ */
+ use strict arg stream = (.IndentedStream~stdout)
+ self~dumpColumnRuler(stream, .true)
+ -- lines of the syntax diagram
+ do line = 1 to self~lineDimension
+ -- horizontal pointers
+ stream~charout(line~format(self~lineDigits)" ")
+ do col = 1 to self~colDimension
+ symbol = self~symbols[line, col]
+ select
+ when symbol == .nil then stream~charout(" ")
+ when symbol~token == .nil then stream~charout(symbol~char" ")
+ otherwise do
+ stream~charout(symbol~char)
+ if symbol~isLastHorizontalSymbolOfToken then do
+ if symbol~token~rightToken == .nil then stream~charout(" ")
+ else stream~charout(".")
+ end
+ else stream~charout("_") -- glue between the characters of the token
+ end
+ end
+ end
+ stream~lineout("")
+ -- vertical pointers, below the symbol
+ stream~charout(" "~copies(self~lineDigits+1))
+ do col = 1 to self~colDimension
+ symbol = self~symbols[line, col]
+ select
+ when symbol == .nil then stream~charout(" ")
+ when symbol~token == .nil then stream~charout(" ")
+ otherwise do
+ if symbol~isLastVerticalSymbolOfToken then do
+ if symbol~token~lowerToken == .nil then stream~charout(" ")
+ else stream~charout(". ")
+ end
+ else stream~charout("|") -- glue between the characters of the token
+ end
+ end
+ end
+ stream~lineout("")
+ end
+
+
+::method dump
+ use strict arg stream = (.IndentedStream~stdout)
+ self~dumpText(stream)
+ stream~lineout("")
+ -- main entries
+ stream~lineout("Main entries :")
+ do entry over self~mainEntries
+ stream~lineout(entry~string)
+ end
+ stream~lineout("")
+ -- continuation entries
+ stream~lineout("Continuation entries :")
+ do entry over self~continuationEntries
+ stream~lineout(entry~string)
+ end
+ stream~lineout("")
+ -- diagnostic
+ if self~errorCount == 0 then stream~lineout("Tokenization : OK")
+ else stream~lineout("Tokenization : KO")
+ do m over self~messages
+ stream~lineout(m)
+ end
+ stream~lineout("")
+ -- symbols not tokenized
+ if self~tokenizationIsComplete == .false then do
+ stream~lineout("Symbols not tokenized :")
+ self~dumpNotTokenizedSymbols(stream)
+ stream~lineout("")
+ end
+ -- symbols tokenized
+ stream~lineout("Symbols tokenized :")
+ self~dumpTokenizedSymbols(stream, .true)
+ stream~lineout("")
+ -- tokens not visited
+ if self~visitIsComplete == .false then do
+ stream~lineout("Tokens not visited :")
+ self~dumpNotVisitedTokens(stream)
+ stream~lineout("")
+ end
+
+
+::method inspect -- for debug
+ use strict arg stream = (.IndentedStream~stdout)
+ stream~lineout(self~string" :")
+ stream~lineout("")
+ self~dump(stream)
+
+
+-------------------------------------------------------------------------------
+::class "Symbol"
+::attribute char
+::attribute col
+::attribute isDummy -- true if created on the fly
+::attribute line
+::attribute tokenizer
+::attribute token
+
+-- Text in the four directions
+
+::attribute l2rText get -- left to right
+ expose l2rText
+ if var("l2rText") then return l2rText -- VAR expects the variable name as a string
+ l2rText = .MutableBuffer~new
+ symbol = self
+ do while symbol <> .nil
+ l2rText~append(symbol~char)
+ symbol = symbol~rightSymbol
+ end
+ l2rText = l2rText~string
+ return l2rText
+
+
+::attribute r2lText get -- right to left
+ expose r2lText
+ if var("r2lText") then return r2lText -- VAR expects the variable name as a string
+ r2lText = .MutableBuffer~new
+ symbol = self
+ do while symbol <> .nil
+ r2lText~append(symbol~char)
+ symbol = symbol~leftSymbol
+ end
+ r2lText = r2lText~string
+ return r2lText
+
+
+::attribute b2tText get -- bottom to top
+ expose b2tText
+ if var("b2tText") then return b2tText -- VAR expects the variable name as a string
+ b2tText = .MutableBuffer~new
+ symbol = self
+ do while symbol <> .nil
+ b2tText~append(symbol~char)
+ symbol = symbol~upperSymbol
+ end
+ b2tText = b2tText~string
+ return b2tText
+
+
+::attribute t2bText get -- top to bottom
+ expose t2bText
+ if var("t2bText") then return t2bText -- VAR expects the variable name as a string
+ t2bText = .MutableBuffer~new
+ symbol = self
+ do while symbol <> .nil
+ t2bText~append(symbol~char)
+ symbol = symbol~lowerSymbol
+ end
+ t2bText = t2bText~string
+ return t2bText
+
+
+::method init
+ use strict arg char, tokenizer, line, col, isDummy=.false
+ self~char = char
+ self~col = col
+ self~isDummy = isDummy
+ self~token = .nil
+ self~line = line
+ self~tokenizer = tokenizer
+ self~objectName = "The '"self~char"' "self~class~id" at ["self~line","self~col"]"
+
+
+::method leftSymbol
+ use strict arg -- none
+ return self~tokenizer[self~line, self~col - 1]
+
+
+::method rightSymbol
+ use strict arg -- none
+ return self~tokenizer[self~line, self~col + 1]
+
+
+::method upperSymbol
+ use strict arg -- none
+ return self~tokenizer[self~line - 1, self~col]
+
+
+::method lowerSymbol
+ use strict arg -- none
+ return self~tokenizer[self~line + 1, self~col]
+
+
+::method setToken
+ use strict arg token
+ if self~token == token then return .true
+ if self~token <> .nil then do
+ -- Should never happen, but...
+ self~tokenizer~addError( "[error] Trying to assign two different tokens to "self" :")
+ self~tokenizer~addMessage(" current : "self~token)
+ self~tokenizer~addMessage(" new : "token)
+ return .false
+ end
+ self~token = token
+ return .true
+
+
+::method isLastHorizontalSymbolOfToken
+ use strict arg -- none
+ token = self~token
+ if token == .nil then return .false
+ text = textWithoutMetaChar(token~text) -- todo : should find a way to not call textWithoutMetaChar, because may bring troubles for Comments
+ select
+ when token~isA(.l2rToken) then do
+ if self~col == (token~firstSymbol~col + text~length - 1) then return .true
+ end
+ when token~isA(.r2lToken) then do
+ if self~col == (token~firstSymbol~col - text~length + 1) then return .true
+ end
+ when token~isA(.t2bToken) then return .true
+ when token~isA(.b2tToken) then return .true
+ otherwise nop -- should never reach here
+ end
+ return .false
+
+
+::method isLastVerticalSymbolOfToken
+ use strict arg -- none
+ token = self~token
+ if token == .nil then return .false
+ select
+ when token~isA(.l2rToken) then return .true
+ when token~isA(.r2lToken) then return .true
+ when token~isA(.t2bToken) then do
+ if self~line == (token~firstSymbol~line + token~text~length - 1) then return .true
+ end
+ when token~isA(.b2tToken) then do
+ if self~line == (token~firstSymbol~line - token~text~length + 1) then return .true
+ end
+ otherwise nop -- should never reach here
+ end
+ return .false
+
+
+::method inspect -- for debug
+ use strict arg stream = (.IndentedStream~stdout)
+ stream~lineout(self~string)
+ stream~indent
+ self~inspectProperties(stream)
+ stream~dedent
+
+
+::method inspectProperties
+ use strict arg stream = (.IndentedStream~stdout)
+ stream~lineout("l2rText = '"self~l2rText"'")
+ stream~lineout("r2lText = '"self~r2lText"'")
+ stream~lineout("b2tText = '"self~b2tText"'")
+ stream~lineout("t2bText = '"self~t2bText"'")
+
+
+-------------------------------------------------------------------------------
+::class "Token"
+::attribute annotation -- used by the parser, to associate parsing information
+::attribute firstSymbol
+::attribute nextSymbol -- The next symbol to analyze, in the main direction of the token
+::attribute text -- The text of the token, either defined (overridden) by a constant, or calculated
+::attribute text class -- The text to match in order to recognize the token, either defined (overridden) by a constant, or not used
+::attribute tokenizer
+
+-- For navigation in the network of tokens
+::attribute leftToken get
+ expose leftToken
+ if self~tokenizer~dereferenceContinuations &,
+ leftToken~isA(.ContinuedFromPreviousLine) then return leftToken~leftToken~leftToken -- dereference the continuation
+ return leftToken
+::attribute leftToken set
+::attribute lowerToken
+::attribute rightToken get
+ expose rightToken
+ if self~tokenizer~dereferenceContinuations &,
+ rightToken~isA(.ContinuedOnNextLine) then return rightToken~rightToken~rightToken -- dereference the continuation
+ return rightToken
+::attribute rightToken set
+::attribute upperToken
+
+
+::method init class
+ self~text = ""
+
+
+::method init
+ use strict arg symbol
+ self~annotation = .nil
+ self~firstSymbol = symbol
+ self~leftToken = .nil
+ self~lowerToken = .nil
+ self~nextSymbol = .nil
+ self~rightToken = .nil
+ self~text = ""
+ self~tokenizer = symbol~tokenizer
+ self~upperToken = .nil
+ self~updateObjectName
+ self~clearVisitMark -- will create an entry in the table of tokens
+
+
+::method updateObjectName
+ -- Some subclasses assign a value to self~text only when tokenizing.
+ -- In this case, this method must be reexecuted after the assignment.
+ self~objectName = "The '"self~text"' "self~class~id" at ["self~firstSymbol~line","self~firstSymbol~col"]"
+
+
+::method value -- some subclasses will redefine this method. Sometimes, the matched text must be stripped.
+ use strict arg -- none
+ return self~text
+
+
+::method l2rChain class
+ use strict arg from, to
+ if from == .nil | to == .nil then return .false
+ error = .false
+ if from == to then do
+ -- Should never happen, but...
+ tokenizer = from~tokenizer
+ tokenizer~addError( "[error] Trying to chain "from" to itself")
+ error = .true
+ end
+ if from~rightToken <> .nil & from~rightToken <> to then do
+ -- Should never happen, but...
+ tokenizer = from~tokenizer
+ tokenizer~addError( "[error] Trying to assign two different right tokens to "from" :")
+ tokenizer~addMessage(" current : "from~rightToken)
+ tokenizer~addMessage(" new : "to)
+ error = .true
+ end
+ if to~leftToken <> .nil & to~leftToken <> from then do
+ -- Should never happen, but...
+ tokenizer = to~tokenizer
+ tokenizer~addError( "[error] Trying to assign two different left tokens to "to" :")
+ tokenizer~addMessage(" current : "to~leftToken)
+ tokenizer~addMessage(" new : "from)
+ error = .true
+ end
+ if error then return .false
+ from~rightToken = to
+ to~leftToken = from
+ return .true
+
+
+::method b2tChain class
+ use strict arg from, to
+ if from == .nil | to == .nil then return .false
+ error = .false
+ if from~upperToken <> .nil & from~upperToken <> to then do
+ -- Should never happen, but...
+ tokenizer = from~tokenizer
+ tokenizer~addError( "[error] Trying to assign two different upper tokens to "from" :")
+ tokenizer~addMessage(" current : "from~upperToken)
+ tokenizer~addMessage(" new : "to)
+ error = .true
+ end
+ if to~lowerToken <> .nil & to~lowerToken <> from then do
+ -- Should never happen, but...
+ tokenizer = to~tokenizer
+ tokenizer~addError( "[error] Trying to assign two different lower tokens to "to" :")
+ tokenizer~addMessage(" current : "to~lowerToken)
+ tokenizer~addMessage(" new : "from)
+ error = .true
+ end
+ if error then return .false
+ from~upperToken = to
+ to~lowerToken = from
+ return .true
+
+
+::method t2bChain class
+ use strict arg from, to
+ if from == .nil | to == .nil then return .false
+ error = .false
+ if from~lowerToken <> .nil & from~lowerToken <> to then do
+ -- Should never happen, but...
+ tokenizer = from~tokenizer
+ tokenizer~addError( "[error] Trying to assign two different lower tokens to "from" :")
+ tokenizer~addMessage(" current : "from~lowerToken)
+ tokenizer~addMessage(" new : "to)
+ error = .true
+ end
+ if to~upperToken <> .nil & to~upperToken <> from then do
+ -- Should never happen, but...
+ tokenizer = to~tokenizer
+ tokenizer~addError( "[error] Trying to assign two different upper tokens to "to" :")
+ tokenizer~addMessage(" current : "to~upperToken)
+ tokenizer~addMessage(" new : "from)
+ error = .true
+ end
+ if error then return .false
+ from~lowerToken = to
+ to~upperToken = from
+ return .true
+
+
+::method visit
+ use strict arg -- none
+ if \ self~alreadyVisited then do
+ self~setVisitMark
+ -- leftToken is not visited because it is not needed to ensure completeness
+ if self~rightToken <> .nil then self~rightToken~visit
+ if self~upperToken <> .nil then self~upperToken~visit
+ if self~lowerToken <> .nil then self~lowerToken~visit
+ end
+
+
+::method setVisitMark
+ use strict arg -- none
+ self~tokenizer~tokens[self] = .true
+
+
+::method clearVisitMark
+ use strict arg -- none
+ self~tokenizer~tokens[self] = .nil
+
+
+::method alreadyVisited
+ use strict arg -- none
+ return self~tokenizer~tokens[self] <> .nil
+
+
+::method inspect -- for debug
+ use strict arg stream = (.IndentedStream~stdout)
+ stream~lineout(self~string)
+ stream~indent
+ self~inspectProperties(stream)
+ stream~dedent
+
+
+::method inspectProperties
+ use strict arg stream = (.IndentedStream~stdout)
+ stream~lineout("firstSymbol = "self~firstSymbol)
+ stream~lineout("nextSymbol = "self~nextSymbol)
+ stream~lineout("text = '"self~text"'")
+ -- stream~lineout("tokenizer = "self~tokenizer)
+ stream~lineout("leftToken = "self~leftToken)
+ stream~lineout("lowerToken = "self~lowerToken)
+ stream~lineout("rightToken = "self~rightToken)
+ stream~lineout("upperToken = "self~upperToken)
+
+
+-------------------------------------------------------------------------------
+-- Left to right token
+::class "l2rToken" mixinclass "Token"
+
+
+::method match class
+ use strict arg symbol
+ if symbol == .nil then return .false
+ if symbol~token~isA(self) then return .true
+ if textHasBOL(self~text) then do
+ if symbol~leftSymbol <> .nil, symbol~leftSymbol~r2lText~strip <> "" then return .false
+ end
+ if textHasEOL(self~text) then do
+ if symbol~l2rText~substr(textWithoutMetaChar(self~text)~length + 1)~strip <> "" then return .false
+ end
+ if symbol~l2rText~pos(textWithoutMetaChar(self~text)) == 1 then return .true
+ return .false
+
+
+::method tokenize class
+ use strict arg symbol
+ if self~match(symbol) then do
+ if symbol~token <> .nil then token = symbol~token -- already tokenized
+ else token = self~new(symbol)
+ -- Mark each symbol of the token (ensure consistency if already tokenized)
+ do i = 1 to textWithoutMetaChar(self~text)~length
+ if symbol~setToken(token) == .false then return .nil
+ symbol = symbol~rightSymbol
+ end
+ token~nextSymbol = symbol
+ return token
+ end
+ return .nil
+
+
+::routine textHasBOL
+ use strict arg text
+ return text~left(5) == "[BOL]"
+
+
+::routine textHasEOL
+ use strict arg text
+ return text~right(5) == "[EOL]"
+
+
+::routine textWithoutMetaChar
+ use strict arg text
+ start = 1
+ if textHasBOL(text) then start = 6
+ length = text~length - start + 1
+ if textHasEOL(text) then length -= 5
+ return text~substr(start, length)
+
+
+-------------------------------------------------------------------------------
+-- Bottom to top token
+::class "b2tToken" mixinclass "Token"
+
+
+::method match class
+ use strict arg symbol
+ if symbol == .nil then return .false
@@ Diff output truncated at 100000 characters. @@
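
Note on the "[BOL]" / "[EOL]" markers used by the l2rToken class texts: the three routines textHasBOL, textHasEOL and textWithoutMetaChar from the diff above can be tried in isolation. The following is only a minimal sketch: the routines are copied unchanged from the diff, but the sample patterns are hypothetical, since the actual token class texts fall in the truncated part of the diff.

```rexx
-- Illustrative sketch only, not part of the commit.
-- The sample patterns are assumptions; the real token texts are not shown in the diff.
patterns = .Array~of("[BOL]>>-", "-><[EOL]", "[BOL]|--[EOL]", "+")
do pattern over patterns
  say pattern~left(15) "BOL="textHasBOL(pattern) "EOL="textHasEOL(pattern) "literal='"textWithoutMetaChar(pattern)"'"
end

-- Routines below are copied as-is from the diff.
::routine textHasBOL
  use strict arg text
  return text~left(5) == "[BOL]"

::routine textHasEOL
  use strict arg text
  return text~right(5) == "[EOL]"

::routine textWithoutMetaChar
  use strict arg text
  start = 1
  if textHasBOL(text) then start = 6          -- skip the leading "[BOL]"
  length = text~length - start + 1
  if textHasEOL(text) then length -= 5        -- drop the trailing "[EOL]"
  return text~substr(start, length)
```

For these four patterns the extracted literals are `>>-`, `-><`, `|--` and `+`. The markers only constrain where a match may occur (nothing but blanks before or after it on the line); the literal is what gets compared against the symbol's l2rText.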