On Fri, Jan 4, 2013 at 10:05 AM, Kirk, Benjamin (JSC-EG311) <benjamin.kirk-1@nasa.gov> wrote:
On Jan 4, 2013, at 11:00 AM, Cody Permann <codypermann@gmail.com> wrote:

> Yes a very decent idea, however would we attempt to distribute boost::regex with libMesh?  I haven't looked at it yet, but fear that it might be one of those features in boost that requires 1 bajillion header files.  If it's only a few, then this would work nicely. :)

Even worse:

So no, we probably wouldn't distribute it, but certainly could take advantage of it if it is there - giving your app a fallback option in case you need an older compiler.

I thought I'd send a little update on what I've found when digging in RE land.  The T-Rex library turned out to be a little too underpowered so I'm definitely not going to go that route.  Its lack of non-greedy operators and other odd "bugs" made it very undesirable.  I had a number of RE cases that it wasn't able to handle that I can run through Perl and Python without issues.

On Tuesday, I messed around a bit trying to get the <regex> library to work and had no success on my Mac with GCC 4.7.2.  The C++11 status page says "partial" for many of the regex-specific pieces, so it might be some time before we'd even want to think about trying to use that portion of the library in libMesh.  Everything compiled, but it just isn't fully implemented yet, so some of the functions just return without doing anything.  I was unable to get even simple examples from the web to run...
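
For reference, here is a minimal sketch of the kind of simple <regex> usage in question. It assumes a standard library with a complete implementation (libstdc++ only gained one in GCC 4.9), and the function name and the bracketed-block input are made up for illustration rather than taken from MOOSE or libMesh.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collect the contents of every [bracketed] block in the input.
// Hypothetical helper: the name and input format are illustrative only.
std::vector<std::string> bracketed_names(const std::string & input)
{
  // ".*?" is the non-greedy form; a greedy ".*" would run to the last ']'.
  static const std::regex pattern(R"(\[(.*?)\])");

  std::vector<std::string> names;
  for (std::sregex_iterator it(input.begin(), input.end(), pattern), end; it != end; ++it)
    names.push_back((*it)[1].str()); // capture group 1: text between the brackets
  return names;
}
```

With a working implementation, calling bracketed_names("[Mesh] dim = 2 [Kernels]") should yield the two names "Mesh" and "Kernels". The non-greedy quantifier is exactly the feature T-Rex was missing.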

The good news is that I have found a very nice RE solution: http://pcre.sourceforge.net/
As the name implies, it is a full Perl-Compatible RE implementation, so I've hit the mother lode.  The build system and organization of the source is very well done so it can be redistributed with another library with little effort.  Following the "no-configure" instructions, I renamed a couple of generic pre-generated header files and then added the source to a contrib directory in MOOSE.  Other than one extra "-D" option, it compiled with the normal libMesh compile rules.  If you aren't concerned with full Unicode support, then it's also very small.  It only contains about a dozen C files for the library itself and compiles in a second or two.  I've cranked through some very nasty regexes and it's working flawlessly!  It supports capturing, named captures, anchors, atomic groups, all the advanced stuff!  Oh and the license is BSD.

It includes a few C++ wrappers too so I might be able to use it as is, but I might wrap it so that I can tweak it a bit.  Anyway, this is exactly what I need to accomplish the advanced parsing that I need.  I'm going to continue working with it in MOOSE for right now, but I'd be happy to push it up to libMesh if there is interest.  If it ends up in libMesh, I think that wrapping the library will be the right way to move forward so that we could eventually use the standard regex library once it is more widely available.
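
To make the wrapping idea concrete, here is a rough sketch of what such an interface might look like. Everything in it is hypothetical: the class name, the search() signature, and the choice to back it with std::regex are assumptions for illustration (std::regex is used only so the sketch compiles without PCRE). The point is that calling code only touches the wrapper, so the backend could be PCRE today and the standard library later.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical wrapper: calling code uses only this interface.
class Regex
{
public:
  explicit Regex(const std::string & pattern) : _re(pattern) {}

  // Returns true if the pattern matches anywhere in `text`. On success,
  // groups[0] is the whole match and groups[i] is the i-th capture.
  bool search(const std::string & text, std::vector<std::string> & groups) const
  {
    std::smatch m;
    groups.clear();
    if (!std::regex_search(text, m, _re))
      return false;
    for (const auto & sub : m)
      groups.push_back(sub.str());
    return true;
  }

private:
  // Backend detail. A PCRE-backed build would hold the compiled
  // pattern from the PCRE library here instead.
  std::regex _re;
};
```

Swapping backends would then mean changing only the private member and the body of search(), not any of the parsing code that calls it.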

BTW, have you looked at Cantera's XML input format and parsing?  Not that I love XML or anything, but in the interest of reusing existing capability…