Re: [tcljava-user] jacl 1.3.1 regexp

Johannes Kleinlercher wrote:
> Hi all,
>
> I look for a way to parse an XML-File in IBMs wsadmin, which uses jacl
> 1.3.1 as a scripting language.
>
> I found a project called sa4was [1] on IBMs developerworks which does
> parsing XML with regexp. sa4was works with WebSphere 5.x (which used jacl
> 1.2.x) however in WebSphere 6.0 (which uses jacl 1.3.1) it doesn't work
> anymore.
>
> Some code in sa4was does the following:
>
> ============================================
>     while {[regexp {([^=]+)="([^"]*?)"(.*)} $restOfTag dontCare attributeName attributeValue restOfTag]} {
>         if {$attributeName == "id"} {
>             set idValue $attributeValue
>             break
>         }
>     }
> ============================================
>
> and there I get the error 
> "couldn't compile regular expression pattern: nested *?+"
>
>
> I found out that regexp changed in jacl 1.3.1 and non-greedy regexp
> (with this "*?" expression) doesn't work anymore. Is that right?  
>
> Questions:
> a) So are there workarounds for something like that? 
> b) Or are there some better ways to parse XML in jacl?
>   
Currently, there is no workaround except porting the regexp code back to
the older Tcl 8.1 style regexp syntax. Very old versions of Jacl made use
of the Oro regexp package, but it was non-free and had to be removed from
Jacl.

But let's back up a minute. I would not suggest using regexp commands to
parse your XML at all. A series of regexp calls like that is going to be
SLOW! There is just no way around it; you would be much better off using
an XML parsing engine written in Java. There are lots and lots of them
available. You could create your XML parser in Java, build up a DOM in
memory, and then pass a handle to the DOM back to your Tcl scripts. It is
actually quite easy to examine and decode a DOM tree with Tcl code once
you have your own utility procs that examine subtrees and extract data
from nodes.
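
A rough sketch of the Java side of that idea, assuming the JAXP DOM
parser that ships with the JDK (javax.xml.parsers). The class name
XmlIdLookup, the helper firstId, and the sample XML are made up for
illustration; they are not part of sa4was or wsadmin:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class: parses XML into a DOM and looks up an "id" attribute.
public class XmlIdLookup {

    /** Parse an XML string into an in-memory DOM tree. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /**
     * Return the "id" attribute of the first element with the given tag
     * name, or an empty string if no such element or attribute exists.
     */
    public static String firstId(Document doc, String tagName) {
        NodeList nodes = doc.getElementsByTagName(tagName);
        if (nodes.getLength() == 0) {
            return "";
        }
        // getAttribute returns "" when the attribute is absent.
        return ((Element) nodes.item(0)).getAttribute("id");
    }

    public static void main(String[] args) throws Exception {
        // Sample input invented for this sketch.
        Document doc = parse("<cell><server id=\"srv1\" name=\"a\"/></cell>");
        System.out.println(firstId(doc, "server")); // prints "srv1"
    }
}
```

The Document returned by parse is the kind of handle a Tcl script could
hold on to; from Jacl it would presumably be reached through the java
package (for example with java::call), though that wiring is not shown
here.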

I hope that helps
Mo DeJong