Name | Modified | Size | Downloads / Week |
---|---|---|---|
cs4(lexer).cs | 2018-02-05 | 1.8 kB | |
cs4(parser).cs | 2018-02-05 | 19.4 kB | |
Totals: 2 Items | | 21.2 kB | 0 |
Download the "CurrentVersion" folder if you need only the library. The frontend contains all the necessary documentation.
CodeConics
CodeConics is a simple, intuitive code-manipulation solution that is easy to use and debug. It is grammar-independent and recursion-based, as opposed to other solutions that are based on DFAs (deterministic finite automata).
CodeConics is a C# AST/lexer/parser generation library - a tool that takes a grammar as input and emits source code for lexical and syntactic analyzers, very much like "lex"/"yacc".
Unlike "lex"/"yacc", however, it emits C# source code instead of C or C++. Also, the analyzers are not automaton-based, so they are not such a nightmare to debug, and they don't require grammar rewriting or impose restrictions - almost any working grammar will do.
However, this design, together with the fact that the algorithm currently used is not optimal, incurs a considerable speed penalty when parsing larger files with larger grammars, so this must be taken into account.
It is still very good for small domain-specific languages or partial grammars, though - for example, when a parser is needed to parse a code snippet so it can be manipulated later.
The idea behind CodeConics is to be easier and more intuitive to use than other current compiler-compiler software by implementing some additional features while keeping the code managed and debuggable.
Why CodeConics?
Being able to parse source code in various languages is extremely useful in complex web automation software, where values have to be extracted from scripts.
It is also imperative in tasks that require random code generation like obfuscation and randomization of scripts and managed code, as well as automation of reverse-engineering tasks like code injection into executable files.
There are 3 major elements in manipulating source code in an object-oriented fashion - the Abstract Syntax Tree (or AST), the lexer, and the parser.
However, there is no good AST library for .Net, as far as I'm aware. There is NRefactory, which is very complex and largely undocumented, and there is the Roslyn framework, which is not bad but is also unnecessarily complex. There is also the incomplete and useless CodeDom. Also, these all support only .Net code and are useless for other languages like JavaScript, Java, or C++.
As for lexers and parsers - those mostly come in the form of source code generated by tools from grammar files. Most such generators are native - written in C++ - or are wrappers around native tools that are hard to debug. Most of these tools derive from the infamous Lex and Yacc. Such tools create automaton-based (DFA) lexers and parsers, which makes debugging next to impossible, and they use complex grammar formats that can describe complete and very large languages like C++ but are hard to write.
"CodeConics.Units" contains a set of universal AST classes that is well documented, lightweight and very easy to use to automate ANY programming language. It is still in beta and lacks proper unit tests for a part of the code.
"CodeConics.Lexers" contains a lexer built on top of the .Net framework regex engine. It is a bit slower then automaton based one, but is really easy to debug, grammars can be built on the fly, switched and serialized as needed. It is also in beta stage and lacks XML documentation but is easy and intuitive to use.
"CodeConics.Parsers" contains the parser. It is recursion based and also intended to be easy to use and debug. It is a poor choice for developing full-blown parsers and compilers for large languages, but is ideal for small domain-specific languages and in cases where a limited part of a language must be facilitated. However, the parser is still a bit undeveloped, so it is advised not to use it, except for the simplest tasks, and integrate/interop different parser engine if required. You can take a look at "Original GOLD Engine - Calitha C# Engine" : http://www.goldparser.org/engine/1/net/van-loenhout/index.htm if you need this. The parser can still be used, but caution is advised - Sorry but this is the current situation :)
Also, there is a fair amount of research papers and resources provided in the files. All the provided sample source codes in different languages are mine and/or licensed under the MIT license. Some of the papers and ebooks, however, might be subjects of copyright laws under different license terms and agreements than the MIT license, under which this software is licensed. I might be in violation of some of those, and if so I can be held responsible for this. I don't care much as I believe knowledge should be free of charge, but just have this in mind.
- The author.