BNFA is a C++ regular expression matcher based on non-deterministic finite automata (NFA).
BNFA provides a way to construct regular expression rules, and an engine for matching input against those rules.
BNFA uses overloaded operators for writing regular expressions. This syntax was inspired by Boost.Spirit (www.boost.org).
| Element | Regular expression | BNFA |
|---|---|---|
| Optional | A? | -A |
| One-or-more | A+ | +A |
| Zero-or-more | A* | *A |
| Exactly N times | A{N} | A(N) |
| At least N times | A{N,} | A(N,infinity) |
| Between M and N times | A{M,N} | A(M,N) |
| Concatenation | AB | A >> B |
| Alternation | A \| B | A \| B |
| Separated list | A(?:BA)* | A % B |
| Capture | (A) | capture(A) |
| Positive lookahead | (?=A) | &A |
| Negative lookahead | (?!A) | !A |
| Character | a | text("a") |
| Any character | . | any() |
| Character sequence | abc | text("abc") |
| Character group | [abc] | group("abc") |
| Character range | [a-z] | range('a', 'z') |
| Character class | [:alnum:] [:alpha:] ... | alnum()... |

    rule vowel = group("AEIOUYaeiouy");
    rule vowels = +vowel;
    rule identifier1 = alpha() >> *alnum(); // Alternative #1
    rule identifier2 = !digit() >> +alnum(); // Alternative #2
    rule floating = -group("+-") >> ( ( *digit() >> text(".") >> +digit() ) | +digit() );

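
To make the operator syntax more concrete, here is a minimal sketch of how overloaded operators can build composable rules. It is not BNFA's implementation: the `Rule` struct, the helpers `one`, `star` and `matches`, and the `number_list` rule are hypothetical names introduced for illustration, and the sketch uses simple backtracking rather than an NFA.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical mini-DSL for illustration only; this is not BNFA's implementation.
// A Rule tries to match at `pos` and calls the continuation `k` with each
// position where its match could end (plain backtracking, not an NFA).
using Cont = std::function<bool(std::size_t)>;
struct Rule {
    std::function<bool(const std::string&, std::size_t, const Cont&)> run;
};

// Matches a single character accepted by the predicate `ok`.
Rule one(std::function<bool(char)> ok) {
    return {[ok](const std::string& s, std::size_t pos, const Cont& k) {
        return pos < s.size() && ok(s[pos]) && k(pos + 1);
    }};
}
Rule group(std::string set) {
    return one([set](char c) { return set.find(c) != std::string::npos; });
}
Rule range(char lo, char hi) {
    assert(lo <= hi);  // precondition: the range must not be empty
    return one([lo, hi](char c) { return lo <= c && c <= hi; });
}
Rule digit() { return one([](char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }); }
Rule alpha() { return one([](char c) { return std::isalpha(static_cast<unsigned char>(c)) != 0; }); }
Rule alnum() { return one([](char c) { return std::isalnum(static_cast<unsigned char>(c)) != 0; }); }
Rule text(std::string lit) {
    return {[lit](const std::string& s, std::size_t pos, const Cont& k) {
        return s.compare(pos, lit.size(), lit) == 0 && k(pos + lit.size());
    }};
}

Rule operator>>(Rule a, Rule b) {  // concatenation: a followed by b
    return {[a, b](const std::string& s, std::size_t pos, const Cont& k) {
        return a.run(s, pos, [&](std::size_t p) { return b.run(s, p, k); });
    }};
}
Rule operator|(Rule a, Rule b) {  // alternation: a, otherwise b
    return {[a, b](const std::string& s, std::size_t pos, const Cont& k) {
        return a.run(s, pos, k) || b.run(s, pos, k);
    }};
}
Rule operator-(Rule a) {  // optional
    return {[a](const std::string& s, std::size_t pos, const Cont& k) {
        return a.run(s, pos, k) || k(pos);
    }};
}
// Greedy repetition: try one more `a` (it must consume input), then try to stop.
bool star(const Rule& a, const std::string& s, std::size_t pos, const Cont& k) {
    return a.run(s, pos, [&](std::size_t p) { return p > pos && star(a, s, p, k); }) || k(pos);
}
Rule operator*(Rule a) {  // zero-or-more
    return {[a](const std::string& s, std::size_t pos, const Cont& k) { return star(a, s, pos, k); }};
}
Rule operator+(Rule a) { return a >> *a; }  // one-or-more
Rule operator!(Rule a) {  // negative lookahead: succeeds without consuming input
    return {[a](const std::string& s, std::size_t pos, const Cont& k) {
        return !a.run(s, pos, [](std::size_t) { return true; }) && k(pos);
    }};
}
Rule operator%(Rule a, Rule b) { return a >> *(b >> a); }  // separated list

// True if the rule matches the entire input.
bool matches(const Rule& r, const std::string& input) {
    return r.run(input, 0, [&](std::size_t end) { return end == input.size(); });
}

// The example rules from above, plus a separated list (hypothetical addition).
const Rule vowels = +group("AEIOUYaeiouy");
const Rule identifier1 = alpha() >> *alnum();
const Rule identifier2 = !digit() >> +alnum();
const Rule floating = -group("+-") >> ((*digit() >> text(".") >> +digit()) | +digit());
const Rule number_list = +digit() % text(",");
```

With these definitions, `matches(floating, "-3.14")` and `matches(floating, "42")` return true, while `matches(floating, "3.")` returns false because the rule requires at least one digit after the decimal point. C++ operator precedence does the grouping: the unary operators bind tighter than `%`, which binds tighter than `>>`, which binds tighter than `|`.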
### Matching Policies ###

The BNFA engine supports three matching policies:

* Match - Checks whether the input matches the rule.
* Capture - Returns a list of the parts of the input that match the captures specified in the rule.
* Lookup - Looks up data that matches the input. This works like a radix tree, but where a radix tree only handles prefixes, this matcher handles arbitrary regular expressions.
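
The difference between the three policies can be sketched with `std::regex` as a stand-in. This does not use BNFA's engine or API: the names `is_match`, `captures` and `LookupTable` are hypothetical, and the lookup here scans its rules one by one (returning the data of the first rule that matches), which an automaton-based engine would not need to do. The sketch assumes C++17.

```cpp
#include <cstddef>
#include <optional>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Stand-in built on std::regex; BNFA's own engine and API are not shown here.

// Match policy: does the whole input match the rule?
bool is_match(const std::regex& rule, const std::string& input) {
    return std::regex_match(input, rule);
}

// Capture policy: return the text matched by each capture group,
// or an empty list if the input does not match the rule.
std::vector<std::string> captures(const std::regex& rule, const std::string& input) {
    std::vector<std::string> parts;
    std::smatch m;
    if (std::regex_match(input, m, rule)) {
        for (std::size_t i = 1; i < m.size(); ++i) {  // m[0] is the whole match
            parts.push_back(m[i].str());
        }
    }
    return parts;
}

// Lookup policy: associate data with rules and return the data of the
// first rule (in insertion order) that matches the whole input.
template <typename T>
class LookupTable {
public:
    void add(const std::string& pattern, T value) {
        entries_.emplace_back(std::regex(pattern), std::move(value));
    }
    std::optional<T> find(const std::string& input) const {
        for (const auto& [rule, value] : entries_) {
            if (std::regex_match(input, rule)) return value;
        }
        return std::nullopt;
    }

private:
    std::vector<std::pair<std::regex, T>> entries_;
};
```

For example, with a rule built from the pattern `(\d{4})-(\d{2})-(\d{2})`, `is_match` accepts `2024-01-31`, and `captures` returns the three parts `2024`, `01` and `31`. A `LookupTable<int>` holding `[a-z]+` mapped to 1 and `[0-9]+` mapped to 2 returns 1 for `abc`, 2 for `123`, and nothing for `a1`.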