Re: [Pyparsing] Efficency of Keyword (and a couple other bits)

Slightly unscientific, since I didn't adjust for varying loads on the
machine or disk caches, but running with and without the sort both gave
a processing time of 85.6.  Looks like the key was choosing to use the
re module, and sorting didn't help.  Interesting :)

Corrin

PS: matching the building name using ~(UNIT_TYPE) didn't work, since
that didn't give me the option to call .setResultsName.  What I've got
at the moment is:
    BUILDING = OneOrMore(Word(nameletters)).setParseAction(rejectBuildingName).setResultsName("BuildingName")

And the parse action is:
    def rejectBuildingName(string,loc,tokens):
        """ Prevent building name of LWR GROUND and similar """
        building_name = ""
        for token in tokens:
            if token == self.SEPCHAR:
                if debug:
                    print "Rejected the building name " + building_name
                raise ParseException(string,loc,"found a field seperator in the building")
            if building_name <> "":
                building_name += " "
            building_name += token
        if self.debug:
            print "Trying to reject the building name %s"%(building_name)
        if spare_parser <> None:
            r = None
            try:
                r = spare_parser.NOT_A_BUILDING.parseString(building_name)
            except ParseException, pe:
                r = r
            if r == None:
                if debug:
                    print "Looks like this building is not a floor or a unit"
            else:
                if debug:
                    print "Rejected as this building looks like a floor or a unit"
                raise ParseException(string,loc,"Rejected %s as a building name - looks like a floor or a unit" % string)
        if debug:
            print "Looks like this building is okay"

        NOT_A_BUILDING = (UNIT_TYPE | FLOOR | BOX_LINE | BAG_LINE)

It feels like a very roundabout way of doing it to me, though it seems
to work well enough.

-----Original Message-----
From: Ralph Corderoy [mailto:ra...@in...]
Sent: Thursday, March 22, 2007 12:11 AM
To: Corrin Lakeland
Cc: pyp...@li...
Subject: Re: [Pyparsing] Efficency of Keyword (and a couple other bits)

Hi Corrin,

I'm glad you've got the speed-up you were after.  Out of interest, how
does the Regexp with the alternatives sorted by frequency compare with
the Regexp with the alternatives sorted by reverse frequency?  This
would show if the re module is optimising without you needing to sort.
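
One way to run that comparison in isolation is sketched below. This is
only a rough illustration using the standard re and timeit modules: the
keyword list, its assumed frequency order, and the sample text are all
made up here, and none of it comes from the real address parser.

```python
import re
import timeit

# Hypothetical keywords, assumed to be listed from most to least frequent.
keywords_by_frequency = ["UNIT", "FLAT", "FLOOR", "LEVEL", "SUITE", "SHOP"]


def build_pattern(words):
    """Compile one whole-word alternation, keeping the given keyword order."""
    alternatives = "|".join(re.escape(word) for word in words)
    return re.compile(r"\b(?:" + alternatives + r")\b")


frequent_first = build_pattern(keywords_by_frequency)
frequent_last = build_pattern(list(reversed(keywords_by_frequency)))

# Made-up sample text, skewed towards the keywords assumed to be frequent.
sample = " ".join(["UNIT 3 FLAT 2 UNIT 7 FLOOR 1 ROSE COTTAGE SHOP 4"] * 1000)


def count_matches(pattern):
    """Return how many keyword matches the pattern finds in the sample."""
    return len(pattern.findall(sample))


if __name__ == "__main__":
    # Time both orderings on the same input; absolute numbers will vary
    # with machine load, so only the relative difference is of interest.
    for label, pattern in (("frequent first", frequent_first),
                           ("frequent last", frequent_last)):
        seconds = timeit.timeit(lambda: count_matches(pattern), number=20)
        print("%-15s %.3f s" % (label, seconds))
```

None of the made-up keywords is a prefix of another, so both orderings
find the same matches and only the timing can differ.  If the two times
come out about the same, that would suggest the ordering of the
alternatives matters little for input like this.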

Cheers,

Ralph.