compact-regex Code

C89/ANSI-C compatible extension for the regex.h library

Brought to you by: nowca

Tree [3b26a5] main /

History

HTTPS access

File	Date	Author	Commit
documentation	2025-07-06	Jonathan Nowca	[6a12f3] Initial commit
examples	2025-07-19	Jonathan Nowca	[14298b] Update
include	2025-08-14	Jonathan Nowca	[3b26a5] Update
.gitattributes	2025-07-15	Jonathan Nowca	[879dde] Update
.gitignore	2025-07-19	Jonathan Nowca	[14298b] Update
Makefile	2025-08-14	Jonathan Nowca	[3b26a5] Update
README.md	2025-08-14	Jonathan Nowca	[3b26a5] Update
compact-regex_usage.c	2025-08-14	Jonathan Nowca	[3b26a5] Update

Read Me

compact-regex.h

This is a C89/ANSI-C compatible extension for the regex.h POSIX/GNU Regular Expression library from 1992.

It is designed as a very simple and modern extension wrapper library for fast implementation of Regex in C or C-compatible projects
like prototyping or basic tasks with regular expressions.

The program is using the POSIX-compatible Basic Regular Expressions (BRE) and Extended Regular Expressions (ERE) standard by the Open Group with extended functions:

simple and intuitive usage in C sourcecode
easy implementation
global search, multiple options
better matching of subexpressions
use of single-letter character classes like \d or \s
handling of ASCII- and Unicode escape sequences
string replacement with group substitution
formatted terminal output functions
file read processing and export functions

Note: The library is not thread-safe and not Perl Regex compatible, it's currently just POSIX Regex with extensions.

CREG

The command line application CREG using the compact-regex.h extensions library: https://github.com/nowca/creg

Project files
- Library files
- Example and usage files
Compilation
- Linux
- Windows
- MacOS
Integration into your project
How to use
Supported Regular Expression operations
Program functions
Program functions for filesystem
Program options
RegEx Object
POSIX Standard
- Basic Regular Syntax
- Extended Regular Syntax
Character classes
ASCII and Unicode sequences
- ASCII codes
- Unicode
Matching subexpressions
Replace strings with regular expressions
File reading and writing

Project files

Library files

include/compact-regex.h
- Main header file which can be included in your project

inculde/compact-regex.c
- Related source code file

Example and usage files

compact-regex_usage.c
- Basic usage examples of compact-regex.h

examples/compact-regex_examples.c
- Many basic regular expression examples

examples/compact-regex_file-reading.c
- File reading and writing with compact-regex.h

Additional files

documentation/Regex - edition 0.12a - 1992.pdf
- Official documentation of regex.h from 1992

examples/example-text-files/*
- Some example text files for parsing

Compilation

Linux

Build the example program by typing in:

user@pc:~$ make

...or compile it directly with the GNU-C-Compiler:

user@pc:~$ gcc -Wall compact-regex_usage.c -o compact-regex_usage

The GNU Extensions with the regex.h library are needed for successful compilation. Please take care of including the neccesary header and library files.
Note: If you want to compile it with the -static flag, to combine the libraries into the code, there will be some memory leaks showed in valgrind. These errors are supressed on dynamically linking by default. (https://stackoverflow.com/questions/7506134/valgrind-errors-when-linked-with-static-why)

user@pc:~$ gcc -Wall -static compact-regex_usage.c -o compact-regex_usage

Windows

To compile the program on windows, you will need a compiler version with the regex.h library, from GNU extensions included:

C:\Users\pcuser>gcc.exe -static -IC:\MinGW-W64\mingw32\opt\include compact-regex_usage.c -o compact-regex_usage.exe -LC:\MinGW-W64\mingw32\opt\lib -lregex

MinGW-W64 includes the regex.h library in the \opt\include and \opt\lib folders.
The paths of the header and library must be included with -I and -L, with an additional -lregex parameter at the end of the command.
-static can be used to make your project independend from libraries.
The path of gcc.exe must be added to the Windows PATH user-variable

MacOS

To compile the program on MacOS or OS X, you will need a compiler version with the regex.h library, from GNU extensions included:

There are several ways to install the GCC development tools on your Mac:
- Xcode
- Homebrew
- MacPorts
- sourcecode compilation
- graphical package installer like Bower or MacUpdate
You need a GCC installation with the regex.h library (GNU Extensions).
For compiler options see Linux.

Integration into your project

Just include the compact-regex.h file into your project, that's all. (You don't need to tell the gcc compiler the paths of the library files via the command-line)

#include "compact-regex.h"

How to use

Simple use

The most fast and simple usage:

RegEx regex_data = regex_match("abc 123 ABC xyz", "abc", REG_ICASE)
regex_print(regex_data, REGEX_PRINT_TABLE);
regex_close(regex_data);

This means: match the the regular expression "abc" in the input text "abc 123 ABC xyz" with the regex.h option flags for searching with insensitive case option and print the results as a table.

Extended use

A more detailed step by step code usage:

int option_flags = REG_GLOBAL | REG_EXTENDED | REG_ICASE;
char* input_string = "abc";
char* regex_pattern = "abc 123 ABC xyz";

RegEx regex_data = regex_compile(regex_pattern, option_flags);

if (regex_data->return_code == REGEX_COMP_SUCCESS)
{
    regex_exec(input_string, regex_data);

    if (regex_data->return_code == REGEX_MATCH_SUCCESS)
    {
        /* show internals of regex struct as variables */
        printf("Text:\n%s\n\n", regex_data->text);
        printf("Regular-Expression:       %s\n", regex_data->pattern);
        printf("Number of matches:        %d\n", regex_data->num_matches);

        printf("\nResults:\n");

        for (i = 0; i < regex_data->num_matches; i++)
        {
            printf("Result Nubmer %d:\n", i);
            printf("Start-Position:     %d\n", regex_data->matches[i].start);
            printf("End-Position:       %d\n", regex_data->matches[i].end);
            printf("Substring:          %s\n", regex_data->matches[i].string);
        }
    }
}
else
{
    regex_error(regex_data);
}

regex_close(regex_data);

compile the regular expression pattern "abc" with the regex.h option flags
execute the compiled regular expression on the input text "abc 123 ABC xyz", if compilation is successful
print the data on the screen, if execution is successful

Example program files

The example files can be used by just running them or working with program arguments:

compact-regex_usage.c

user@pc:~$ ./compact-regex_usage

This runs the basic example implementation with examples for:
- basic usage and extended usage
- string replacements

user@pc:~$ ./compact-regex_usage "abc 123 XYZ" "\d+"

This searches for all digit words "\d+" in the string "abc 123 XYZ"

user@pc:~$ ./compact-regex_usage "abc 123 XYZ" "\d+" "###"

This searches for all digit words "\d+" in the string "abc 123 XYZ" and replace it with the substring "###", so the result will output "abc ### XYZ"

examples/compact-regex_examples.c

user@pc:~$ ./compact-regex_examples

This contains many basic regular expression examples:
- word length between range
- number value between range
- phone number
- binary number
- hexadecimal number
- repeating numbers
- validate email address
- validate domain name
- validate ip addresses
- validate time format
- validate timestamp
- get youtube url
- special characters
The example function can be selected by a number or typed in as a number as first argument.

examples/compact-regex_file-reading.c

user@pc:~$ ./compact-regex_file-reading

This runs the a file reading and writing example implementation with examples for:
- read of a csv file, with write of table txt-file
- read of a windows formatted reg file, with write of a json-file
- read of a large word list file, with write of plain txt-file

user@pc:~$ ./compact-regex_file-reading Makefile "SRC"

This matches the substring "SRC" in the file "Makefile" and prints the output as a list.

user@pc:~$ ./compact-regex_file-reading Makefile "SRC" "###"

This matches the substring "SRC" in the file "Makefile" and replace all matches with the substring "###".

Supported Regular Expression operations

Supported:	Not supported:
Wildcard `.`	Lazy `*?` `+?` `??`
Character classes `\d` `\D` `\w` `\W`	Negative Lookahead `(?!)`
POSIX character classes `[:digit:]`	Negative Lookbehind `(?<!)`
Whitespace `\s` `\S`	Positive Lookahead `(?<=)`
Character Sets `[abc]`	Positive Lookbehind `(?<=)`
Escaping `\`
The Asterisk `*`
The Plus `+`
The Question Mark `?`
Numeric Quantifier `{n}`
Range Quantifier `{n,m}`
Alternation `\|`
Anchors `^` `$`
Capturing Groups `a(b)c`
Backreferences `\1`
ASCII and Unicode sequences

Program functions

regex_match()

RegEx regex_match(char* input_text_string, char* regex_pattern_string, int OPTION_FLAGS)

Compiles a regular expression pattern with given option flags and executes the regular expression.

Return value: returns the RegEx Object with the regular expression results.

regex_compile()

RegEx regex_compile(char* regex_pattern_string, int OPTION_FLAGS)

Compiles a regular expression pattern with given option flags and returns the RegEx Object for execution processing.

Return value: returns the RegEx Object with the compiled regular expression pattern.

regex_exec()

int regex_exec(char* input_text_string, RegEx regex)

Executes a compiled regular expression pattern and compares it with a given input text string.

Return value: returns the return code of regexec().

regex_close()

void regex_close(RegEx regex)

Frees the memory of allocated library and regex.h buffers.

regex_error()

int regex_error(RegEx regex)

Writes the error code message of regerror() into the error message buffer and prints the error message to stderr.

Return value: returns the return code of regerror().

regex_replace()

char* regex_replace(char* input_text_string, char* regex_pattern_string, char* replace_substring, int OPTION_FLAGS)

Replaces regular expression matches with a replacement substring and the given option flags.

The replacement substring can be used with group references ("\1", "\2", "\3"...)

Return value: The output string with the replaced substring values.

regex_print()

void regex_print(RegEx regex, int PRINT_LAYOUT)

Prints the input text and the regular expression results and contents of a RegEx Object as a table or as a list with a given print layout.

The terminal screen output can be set to colored ANSI output with the global variable PRINT_COLORED

set_default_reg_flags()

void set_default_reg_flags(int OPTION_FLAGS)

Sets the default REG_ option flags for regex_compile() and regcomp()

regex_readfile()

RegExFile regex_readfile(char* file_name)

Reads a file, its attributes and contents by the given filename.

Return value: The RegExFile object

regex_closefile()

void regex_closefile(RegExFile regex_file)

Closes a RegExFile object, frees attributes and contents and its allocated memory

regex_writefile()

int regex_writefile(RegEx regex_data, int PRINT_LAYOUT, char* file_name)

Writes the input text and the regular expression results and contents of a RegEx Object as a table or table into a file with a given print layout.

Return value: Return 1 if the write was successful, or 0 if not

regex_writefile_string()

int regex_writefile_string(char* output_string, char* file_name)

Writes a string into a file.

Return value: Return 1 if the write was successful, or 0 if not

Program options

Option flags

The option flags passed to regex_compile() and regex_match() are processed and passed through to the internal regex.h function regcomp().

According to the regex.h documentation they are:

REG_EXTENDED - Support extended regular expressions.
REG_ICASE - Ignore case in match.
REG_NEWLINE - Eliminate any special significance to the newline character.

Additionaly REG_NEWLINE is documented in Regex - edition 0.12a - 1992 as follows:

match-any-character operator (see Section 3.2 [Match-any-character Operator],
page 9) doesn’t match a newline.

nonmatching list not containing a newline (see Section 3.6 [List Operators],
page 13) matches a newline.REG_NOSUB

match-beginning-of-line operator (see Section 3.9.1 [Match-beginning-of-line Op-
erator], page 18) matches the empty string immediately after a newline, regardless
of how REG_NOTBOL is set (see Section 7.2.3 [POSIX Matching], page 37, for an
explanation of REG_NOTBOL).

match-end-of-line operator (see Section 3.9.1 [Match-beginning-of-line Operator],
page 18) matches the empty string immediately before a newline, regardless of how
REG_NOTEOL is set (see Section 7.2.3 [POSIX Matching], page 37, for an explanation
of REG_NOTEOL).

REG_NOSUB - Report only success or fail in regexec(), that is, verify the syntax of a regular expression. If this flag is set, the regcomp() function sets re_nsub to the number of parenthesized sub-expressions found in pattern. Otherwise, a sub-expression results in an error.

(!) The REG_NOSUB option flag is deactivated in the program.

The compact-regex.h library adds the following additional flags as optional functions:

REG_GLOBAL - Uses global-search with multiple matches instead of single matching
REG_MULTILINE - Catches the newline character, automaticly deactivates REG_NEWLINE
REG_NOSUBEXP - Ignore matching of grouped submatches by subexpressions
REG_SUBEXP - Match only subexpressions

You can use them directly as function arguments like this:

regex_compile("[a-z]*", REG_GLOBAL | REG_EXTENDED | REG_ICASE | REG_MULTILINE);

...or you can pass them as a integer variable with the options:

int option_flags = REG_GLOBAL | REG_EXTENDED | REG_ICASE | REG_MULTILINE;

regex_compile("[a-z]*", option_flags);

Default Option flags

The option flags REG_GLOBAL, REG_EXTENDED, REG_NEWLINE are set by default so you don't need to pass them.

If you don't want to search for all occurrences of a substring, use basic regular expression syntax or search with special newline treatment, you can set the default option flags by yourself with:

set_default_reg_flags(int OPTION_FLAGS)

Example:

set_default_reg_flags(REG_GLOBAL | REG_EXTENDED | REG_ICASE);

This will set the option flags REG_GLOBAL, REG_EXTENDED, REG_ICASE to the default option flags for working with the library functions.

regex_compile("[a-z]*", REG_DEFAULT);

This will use the default option flags, even if you don't pass them to the regular expression compiler.

regex_compile("[a-z]*", REG_ICASE);

This would mean: use the default option flags REG_GLOBAL, REG_EXTENDED, REG_NEWLINE with the additional option flag REG_ICASE.

Print layout

The layout flags for printing and file export are:

REGEX_PRINT_TABLE - print as table
REGEX_PRINT_LIST - print as list
REGEX_PRINT_LIST_FULL - print as full list
REGEX_PRINT_PLAIN - just print the results only
REGEX_PRINT_CSV - print the results as CSV-Format (Character-Separated Values)
REGEX_PRINT_JSON - print the results in JSON-Format (JavaScript Object-Notation)

Additional printing layout filter

The print layout flag can be extended with a additional layout filter flag:

REGEX_PRINT_FULLTEXT - print the full input text
REGEX_PRINT_NOTEXT - don't print the input text
REGEX_PRINT_NOSTATS - don't print the regular expression statistic data
REGEX_PRINT_NORESULTS - don't print the results of the regular expression execution
REGEX_PRINT_NOINDEX - don't print the index positions of the results

You can use the layout printing flag and its filters like this:

regex_print(regex_data, REGEX_PRINT_TABLE | REGEX_PRINT_NOTEXT | REGEX_PRINT_NOSTATS);

This prints just the results as a table without the input text string and without the regular expression statistics.

Colored terminal output

The terminal screen output can be set to colored ANSI text:

/* Prints the terminal output in colored ANSI text */
unsigned int PRINT_COLORED = 0;

RegEx Object

The RegEx object contains all the related data of the regular expression process:

cregflags_t flags;          /* status of option flags */
cregmatches_t* matches;     /* array with the match start and end string positions and the substring */
cregfile_t file;            /* file object */
int num_matches;            /* number of matches */
int num_pattern_subexpr;    /* number of corresponding sub-expressions */
int return_code;            /* return code of the expression string compilation */
char* text;                 /* the regular expression input text string */
char* pattern;              /* the regular expression string pattern */
char error_message[128];    /* error message buffer */
regex_h_ref regex_h;        /* reference to internal regex.h-variables */

The fields flags, num_pattern_subexpr and pattern are related to regex_compile(). The regex_compile() function returns a RegEx object with the compiled data in the field regex_h with a return_code and an error_essage on failure.
The returned RegEx object will be executed with regex_exec() and contains the related field matches, num_matches, text and a internal values in the field regex_h.
The RegEx object can be printed with regex_print() on screen with or closed with regex_close() after comilation and execution.
regex_compile() and regex_exec() can both be used combined as regex_match().
The RegEx object must be closed with regex_close() to free the allocated memory.

RegEx subobjects:

Field: flags

The subobject regexobj->flags contains the regex option flags status.

int GLOBAL;
int EXTENDED;
int ICASE;
int MULTILINE;
int NEWLINE;
int NOSUB;           /* note: REG_NOSUB is deactivated in the program */
int NOSUBEXP;
int SUBEXP

The can be set with the Option flags on regex_compile() or regex_match().

Field: matches

The subobject regexobj->matches[i] contains the regex match result data:

int number_match;    /* number of the match */
int number_submatch; /* number of the group or submatch */
int start;           /* byte offset from string's start to substring's start. */
int end;             /* byte offset from string's start to substring's end. */
char* string;        /* string of the sub-expression match */

Field: file

The subobject regexobj->file can be connected to a RegExFile object. This can be useful for printing or other functions.

POSIX Standard

Basic Regular Syntax

Metacharacter	Description
^	Matches the starting position within the string. In line-based tools, it matches the starting position of any line.
.	Matches any single character (many applications exclude newlines, and exactly which characters are considered newlines is flavor-, character-encoding-, and platform-specific, but it is safe to assume that the line feed character is included). Within POSIX bracket expressions, the dot character matches a literal dot. For example, a.c matches "abc", etc., but [a.c] matches only "a", ".", or "c".
[ ]	A bracket expression. Matches a single character that is contained within the brackets. For example, [abc] matches "a", "b", or "c". [a-z] specifies a range which matches any lowercase letter from "a" to "z". These forms can be mixed: [abcx-z] matches "a", "b", "c", "x", "y", or "z", as does [a-cx-z]. The - character is treated as a literal character if it is the last or the first (after the ^, if present) character within the brackets: [abc-], [-abc], [^-abc]. Backslash escapes are not allowed. The ] character can be included in a bracket expression if it is the first (after the ^, if present) character: []abc], [^]abc].
[^ ]	Matches a single character that is not contained within the brackets. For example, [^abc] matches any character other than "a", "b", or "c". [^a-z] matches any single character that is not a lowercase letter from "a" to "z". Likewise, literal characters and ranges can be mixed.
$	Matches the ending position of the string or the position just before a string-ending newline. In line-based tools, it matches the ending position of any line.
( )	Defines a marked subexpression, also called a capturing group, which is essential for extracting the desired part of the text (See also the next entry, \n). BRE mode requires ( ).
\n	Matches what the nth marked subexpression matched, where n is a digit from 1 to 9. This construct is defined in the POSIX standard.[36] Some tools allow referencing more than nine capturing groups. Also known as a back-reference, this feature is supported in BRE mode.
*	Matches the preceding element zero or more times. For example, abc matches "ac", "abc", "abbbc", etc. [xyz] matches "", "x", "y", "z", "zx", "zyx", "xyzzy", and so on. (ab)* matches "", "ab", "abab", "ababab", and so on.
{m,n}	Matches the preceding element at least m and not more than n times. For example, a{3,5} matches only "aaa", "aaaa", and "aaaaa". This is not found in a few older instances of regexes. BRE mode requires {m,n}.

Extended Regular Syntax

Metacharacter	Description
?	Matches the preceding element zero or one time. For example, ab?c matches only "ac" or "abc".
+	Matches the preceding element one or more times. For example, ab+c matches "abc", "abbc", "abbbc", and so on, but not "ac".
\|	The choice (also known as alternation or set union) operator matches either the expression before or the expression after the operator. For example, abc

Source: https://en.wikipedia.org/wiki/Regular_expression#POSIX_basic_and_extended

Character classes

Description	POSIX	Shortcode	ASCII
ASCII characters		\x[Bytecode]
Alphanumeric characters	[:alnum:]		[A-Za-z0-9]
Alphanumeric characters plus "_"		\w	[A-Za-z0-9_]
Non-word characters		\W	[^A-Za-z0-9_]
Alphabetic characters	[:alpha:]	\a	[A-Za-z]
Space and tab	[:space:]	\s
	[:blank:]	\t
Non-whitespace characters		\S	[^ ]
Word boundaries		\b
Non-word boundaries		\B
Digits	[:digit:]	\d	[0-9]
Non-digits		\D	[^0-9]
Lowercase letters	[:lower:]	\l	[a-z]
Uppercase letters	[:upper:]	\u	[A-Z]
Visible characters	[:print:]	\p	[\x20-\x7E]

ASCII and Unicode sequences

ASCII Codes

You can use a hexadecimal bytecode sequences to match a specific ASCII character:

RegEx regex_data = regex_compile("\x61\x62\x63", REG_DEFAULT);
regex_exec("abc 123 ABC xyz", regex_data);

This matches the ASCII characters of the "abc" of the regular expression pattern in the input text string "abc 123 ABC xyz"

Unicode

You can do the same with Unicode sequences:

RegEx regex_data = regex_compile("\u20AC|\u00b5", REG_DEFAULT);
regex_exec("! € µ ? x y z", regex_data);

This matches the special characters "€" or "µ" in the input text string "! € µ ? x y z"

Matching subexpressions

Groups (or submatches) in a regular expression can be matched in a text with the use of parentheses as subexpressions to create numbered capture groups.

Example:

The input-string: "abc 123 ABC xyz"
and the regular expression pattern: "((\w+) (\d+)) (.*)"
matches:
- the whole string abc 123 ABC xyz
- 4 groups:
  - 1: abc 123 by the subexpression ((\w+) (\d+))
  - 2: abc by the subexpression (\w+)
  - 3: 123by the subexpression (\d+)
  - 4: ABC xyz by the subexpression (.*)

It is possible to match only the subexpressions with the REG_SUBEXP option flag, see option flags.

Replace strings with regular expressions

The regex_replace() function replaces all matches of a regular expression with a substring:

Example:

char* output_string = regex_replace("Mr Black is changing his 6 strings on his Brown guitar", "black|brown", "Blue", REG_GLOBAL | REG_ICASE);

This will match all upper case and lower case words "black" or "brown" in the input text string and replace it with the word "Blue". It returns a pointer to a character array string with the modified text:
- "Mr Blue is changing his 6 strings on his Blue guitar".

Example with group references:

This will replace "ABC" with "CBA":

char* output_string = regex_replace("ABC", "(A)(B)(C)", "\\3\\2\\1", REG_DEFAULT);

Example with multiple replacements:

int option_flags = REG_GLOBAL | REG_ICASE;
char* input_string = "Mr Black is changing his 6 strings on his Brown guitar";
char* output_string_1;
char* output_string_2;

output_string_1 = regex_replace(input_string, "black|Brown", "Blue", option_flags);
output_string_2 = regex_replace(output_string_1, "guitar", "acoustic guitar", option_flags);

output_string_2 will be:
- "Mr Blue is changing his 6 string on his Blue acoustic guitar".

File reading and writing

RegExFile Object

The Object RegExFile contains fields for the regex file i/o data:

FILE* ptr;                  /* pointer to the FILE object for reading and writing*/
char* name;                 /* filename string */
char* buffer;               /* file buffer */
int status;                 /* status of file reading */
int length;                 /* file-length */

The RegExFile object can be connected to the a pointer in the RegEx object like regex_data->file.

Reading from a file

The contents of a file can be read into the buffer of the RegExFile object.

The Maximum value limiters must be changed in order to work with larger files:

/* file length is 30 KB = 30720 bytes */
MAX_TEXT_LENGTH = 30720;
/* there more than 4000 matches */
MAX_NUM_MATCHES = 4096;
/* print the first 1024 characters */
MAX_PRINT_TEXT_LENGTH = 1024;

RegExFile regex_file = regex_readfile("example-file.txt");

if (regex_file->status > 0)
{
    RegEx regex_data = regex_compile("[a-zA-Z]+ [0-9]+", REG_GLOBAL | REG_ICASE);
    regex_exec(regex_file->content, regex_data);

    if (regex_data->return_code == REGEX_MATCH_SUCCESS)
    {
        regex_data->file = regex_file;
        regex_print(regex_data, REGEX_PRINT_LIST);
        regex_writefile(regex_data, REGEX_PRINT_TABLE | REGEX_PRINT_NOTEXT | REGEX_PRINT_NOSTATS, "output_as_table.txt");
    }

    regex_closefile(regex_file);
    regex_close(regex_data);
}

Read the file into the RegExFile object. Check the status of fread(), then compile the regeular expression and execute it on the file content. If the matching is successful, print the matched results as a list and export it to a file as table without the text and regular expression statistics.
The RegExFile object must not be connected to the RegEx, but it's useful for functions like regex_print() to print related i/o data.
The RegExFile must also be closed with regex_closefile() to free the allocated memory.
The RegExFile must be set to NULL before regex_closefile(), if regex_readfile() is not called, to avoid a runtime error.

The program can handle large text files with more than 1.000.000 lines and over 100.000 matches. It is tested with larger files over 10 MB up to 100 MB. But very large text files with more than 10.000 matches can take a long time to process (more than 10 minutes).

Maximum value limiters

The maximum value limiters can be changed to larger size values:

/* Memory limiters */
unsigned int MAX_TEXT_LENGTH = 8192;
unsigned int MAX_PATTERN_LENGTH = 1024;
unsigned int MAX_NUM_MATCHES = 1024;
unsigned int MAX_PRINT_TEXT_LENGTH = 512;
unsigned int MAX_FILENAME_LENGTH = 256;

These are the default values. They can easily be changed to other sizes, if needed.

Writing to a file

regex_writefile()

The regex_writefile() function is quite similar to the regex_print() function. It can be used in the same way for printing the content into a formated text file using the Print layout.

Examples:

regex_writefile(regex_data, REGEX_PRINT_PLAIN, "output_words.txt");

Just print the results into a text file.

regex_writefile(regex_data, REGEX_PRINT_TABLE | REGEX_PRINT_NOTEXT | REGEX_PRINT_NOSTATS, "output_as_table.txt");

Print a table without the text or any additional information.

regex_writefile(regex_data, REGEX_PRINT_JSON, "output_reg_file.json");

Print the results into a JSON file.

regex_writefile_string()

The regex_writefile_string() function can be used to print a basic string into a file.

Example:

regex_writefile_string(output_string, "replaced_words.txt");

Acknowledgements

Regex (edition 0.12) - Kathryn A. Hargreaves, Karl Berry
You can read a compiled version of their documentation in the subfolder: documentation/Regex - edition 0.12a - 1992.pdf
The makers of Sublime Text for a very nice text editor software.
List Of English Words - https://github.com/dwyl/english-words

compact-regex Code

C89/ANSI-C compatible extension for the regex.h library

Branches

Tree [3b26a5] main / Download Snapshot History

Read Me

compact-regex.h

CREG

Table of contents

Project files

Library files

Example and usage files

Additional files

Compilation

Linux

Windows

MacOS

Integration into your project

How to use

Simple use

Extended use

Example program files

Supported Regular Expression operations

Program functions

regex_match()

regex_compile()

regex_exec()

regex_close()

regex_error()

regex_replace()

regex_print()

set_default_reg_flags()

regex_readfile()

regex_closefile()

regex_writefile()

regex_writefile_string()

Program options

Option flags

Default Option flags

Print layout

Additional printing layout filter

Colored terminal output

RegEx Object

POSIX Standard

Basic Regular Syntax

Extended Regular Syntax

Character classes

ASCII and Unicode sequences

ASCII Codes

Unicode

Matching subexpressions

Replace strings with regular expressions

File reading and writing

RegExFile Object

Reading from a file

Maximum value limiters

Writing to a file

regex_writefile()

regex_writefile_string()

Acknowledgements

Tree [3b26a5] main /

History