Like the EDIFACT lexer I did two and a half years ago, I've also done an X12 parser.
X12 is a data interchange format originating in the US, with lots of specifications for different purposes, but they all share the same syntax.
Read here for more information on the format:
http://www.rawlinsecconsulting.com/x12tutorial/x12syn.html
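As a rough illustration of that shared syntax: an X12 interchange opens with a fixed-width ISA segment (106 characters including the terminator), and the separators used by the rest of the interchange can be read from fixed offsets in it. The sketch below assumes that layout; `X12Separators` and `ReadSeparators` are hypothetical names for illustration, not part of Scintilla or the submitted lexer.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type, not part of Scintilla.
struct X12Separators {
    char element;     // separates data elements, commonly '*'
    char subElement;  // separates components inside an element
    char segmentEnd;  // terminates every segment
};

// Reads the separators from the fixed-width ISA header.
// Returns false if the text does not start with a full-length ISA segment.
bool ReadSeparators(const std::string &doc, X12Separators &out) {
    const std::size_t isaLength = 106;  // assumed fixed ISA length
    if (doc.size() < isaLength || doc.compare(0, 3, "ISA") != 0)
        return false;
    out.element = doc[3];       // character right after "ISA"
    out.subElement = doc[104];  // last data element of the ISA segment
    out.segmentEnd = doc[105];  // final character of the ISA segment
    return true;
}
```

The sample further down appears to have had its ISA padding collapsed, so it would fail the length check here.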
The only property is fold on/off.
The skeleton for the lexer came from the EDIFACT lexer, so it should already be formatted much like Scintilla code, as I made quite a few changes there to keep Neil happy. (As he should be!)
In addition to the CXX file, I had to add
LINK_LEXER(lmX12);
to Catalogue.cxx,
and to SciLexer.h
#define SCLEX_X12 128
...
#define SCE_X12_DEFAULT 0
#define SCE_X12_BAD 1
#define SCE_X12_ENVELOPE 2
#define SCE_X12_FUNCTIONGROUP 3
#define SCE_X12_TRANSACTIONSET 4
#define SCE_X12_SEGMENTHEADER 5
#define SCE_X12_SEGMENTEND 6
#define SCE_X12_SEP_ELEMENT 7
#define SCE_X12_SEP_SUBELEMENT 8
That's it!
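To give a feel for how those style numbers might be used, here is a minimal sketch that picks a style from a segment ID. The mapping (ISA/IEA as envelope, GS/GE as function group, ST/SE as transaction set) is an assumption based on the style names and is not taken from the attached lexer; `StyleForSegmentId` is a hypothetical helper.

```cpp
#include <string>

// Style numbers as listed above for SciLexer.h.
#define SCE_X12_DEFAULT 0
#define SCE_X12_BAD 1
#define SCE_X12_ENVELOPE 2
#define SCE_X12_FUNCTIONGROUP 3
#define SCE_X12_TRANSACTIONSET 4
#define SCE_X12_SEGMENTHEADER 5

// Hypothetical helper: choose a style for the ID that starts a segment.
int StyleForSegmentId(const std::string &id) {
    if (id == "ISA" || id == "IEA")
        return SCE_X12_ENVELOPE;        // interchange header / trailer
    if (id == "GS" || id == "GE")
        return SCE_X12_FUNCTIONGROUP;   // functional group header / trailer
    if (id == "ST" || id == "SE")
        return SCE_X12_TRANSACTIONSET;  // transaction set header / trailer
    if (id.empty() || id.size() > 3)
        return SCE_X12_BAD;             // assumed: IDs are at most 3 characters
    return SCE_X12_SEGMENTHEADER;       // any other segment
}
```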
Here's some sample x12 to test with:
ISA*00* *00* *01*0011223456 *01*999999999 *950120*0147*U*00300*000000005*0*P*~
GS*PO*0011223456*999999999*950120*0147*5*X*003040
ST*850*000000001
BEG*00*SA*95018017***950118
N1*SE*UNIVERSAL WIDGETS
N3*375 PLYMOUTH PARK*SUITE 205
N4*IRVING*TX*75061
N1*ST*JIT MANUFACTURING
N3*BUILDING 3B*2001 ENTERPRISE PARK
N4*JUAREZ*CH**MEX
N1*AK*JIT MANUFACTURING
N3*400 INDUSTRIAL PARKWAY
N4*INDUSTRIAL AIRPORT*KS*66030
N1*BT*JIT MANUFACTURING
N2*ACCOUNTS PAYABLE DEPARTMENT
N3*400 INDUSTRIAL PARKWAY
N4*INDUSTRIAL AIRPORT*KS*66030
PO1*001*4*EA*330*TE*IN*525*VN*X357-W2
PID*F****HIGH PERFORMANCE WIDGET
SCH*4*EA****002*950322
CTT*1*1
SE*20*000000001
GE*1*5
IEA*1*000000005
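For anyone who wants to poke at the sample outside Scintilla, a minimal sketch of breaking X12 text into segments and elements might look like this. It assumes the separators are already known and are passed in (for the sample, `*` between elements and one segment per line); `SplitOn` and `ParseSegments` are hypothetical helpers, not code from the lexer.

```cpp
#include <string>
#include <vector>

// Splits text on a single separator character, keeping empty fields
// (empty elements are meaningful in X12, e.g. "BEG*00*SA*95018017***950118").
std::vector<std::string> SplitOn(const std::string &text, char separator) {
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type end = text.find(separator, start);
        if (end == std::string::npos) {
            parts.push_back(text.substr(start));
            break;
        }
        parts.push_back(text.substr(start, end - start));
        start = end + 1;
    }
    return parts;
}

// Breaks a document into segments, then each segment into its elements.
// Empty segments (such as the one after a trailing terminator) are skipped.
std::vector<std::vector<std::string>> ParseSegments(const std::string &doc,
                                                    char segmentEnd,
                                                    char elementSep) {
    std::vector<std::vector<std::string>> segments;
    for (const std::string &segment : SplitOn(doc, segmentEnd)) {
        if (!segment.empty())
            segments.push_back(SplitOn(segment, elementSep));
    }
    return segments;
}
```

Calling `ParseSegments(text, '\n', '*')` on the sample would give one vector per segment, with the segment ID as the first element.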
This was coded in VS2017 on Windows, but there are no fancy C++ features being used, so it should be pretty safely cross-platform.
Diff:
I added some ~~~ code quoting to the above description as Markdown uses * and _ as formatting.
The preferred way to update Scintilla with a new lexer is to change the Scintilla.iface file then run scripts/LexGen.py as that updates SciLexer.h, adds the lexer to Catalogue.cxx, and adds the file to all the make and project files.
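There are some minor pieces of dead code which may be incomplete changes or just inefficiencies. The condition on line 140, if (posCurrent == posFinish), can never be true as the loop header is while (posCurrent < posFinish) and posCurrent hasn't changed. Perhaps the code was changed to produce T instead of modifying posCurrent and T should be involved in the test.
On line 130 posSegmentStart is declared and set. It is later changed but is never read so appears redundant. From g++:
../lexers/LexX12.cxx: In member function 'virtual void LexerX12::Lex(Sci_PositionU, Sci_Position, int, Scintilla::IDocument*)':
../lexers/LexX12.cxx:130:16: warning: variable 'posSegmentStart' set but not used [-Wunused-but-set-variable]
Sci_PositionU posSegmentStart = 0;
It can be worthwhile running other compilers like g++, or static analyzers like cppcheck, which reports:
scintilla\lexers\LexX12.cxx:140:18: warning: Opposite inner 'if' condition leads to a dead code block. [oppositeInnerCondition]
if (posCurrent == posFinish)
scintilla\lexers\LexX12.cxx:132:20: note: outer condition: posCurrent<posFinish
while (posCurrent < posFinish)
scintilla\lexers\LexX12.cxx:140:18: note: opposite inner condition: posCurrent==posFinish
if (posCurrent == posFinish)
scintilla\lexers\LexX12.cxx:130:16: warning: The scope of the variable 'posSegmentStart' can be reduced. [variableScope]
Sci_PositionU posSegmentStart = 0;
scintilla\lexers\LexX12.cxx:130:0: warning: Variable 'posSegmentStart' is assigned a value that is never used. [unreadVariable]
Sci_PositionU posSegmentStart = 0;
^
scintilla\lexers\LexX12.cxx:134:19: warning: Variable 'posSegmentStart' is assigned a value that is never used. [unreadVariable]
posSegmentStart = posCurrent;
scintilla\lexers\LexX12.cxx:162:20: warning: Variable 'posSegmentStart' is assigned a value that is never used. [unreadVariable]
posSegmentStart = T.pos + T.length;
scintilla\lexers\LexX12.cxx:189:0: warning: Variable 'indentNext' is assigned a value that is never used. [unreadVariable]
int indentNext = indentCurrent;
^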
The results of static checkers aren't always valid or interesting but may indicate that the code evolved without updating all consequences.
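To make the dead condition concrete, here is a small self-contained sketch of the same loop shape. It is not the code from LexX12.cxx: `Token`, `NextToken`, `LoopResult` and `LexLoop` are hypothetical stand-ins, and the tokenizer simply treats everything up to and including the next `*` as one token. With the test on posCurrent the inner branch never runs; with the test on the end of T it runs once, on the last token.

```cpp
#include <cstddef>
#include <string>

// Hypothetical token type; the real lexer's T may differ.
struct Token {
    std::size_t pos;
    std::size_t length;
};

// Hypothetical tokenizer: a token runs up to and including the next '*',
// or to the end of the text if there is no further '*'.
Token NextToken(const std::string &text, std::size_t from) {
    const std::size_t sep = text.find('*', from);
    const std::size_t end = (sep == std::string::npos) ? text.size() : sep + 1;
    return Token{from, end - from};
}

struct LoopResult {
    int tokens;           // number of tokens produced
    int finalBranchHits;  // how often the inner 'if' body ran
};

LoopResult LexLoop(const std::string &text, bool testTokenEnd) {
    LoopResult result = {0, 0};
    std::size_t posCurrent = 0;
    const std::size_t posFinish = text.size();
    while (posCurrent < posFinish) {
        const Token T = NextToken(text, posCurrent);
        // Testing posCurrent here contradicts the loop condition, so it is
        // always false. Testing the end of T asks a question that can vary.
        const std::size_t tested = testTokenEnd ? T.pos + T.length : posCurrent;
        if (tested == posFinish)
            result.finalBranchHits++;
        result.tokens++;
        posCurrent = T.pos + T.length;
    }
    return result;
}
```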
Hello Neil,
I haven't been able to log in to SourceForge since Friday (I have emailed them), but I have updated the lexer CPP file. If you're impatient, it's attached; otherwise wait a few days!
Regards,
Iain.
Ps, cppcheck is happy with it now, and I've redone the colouring to work more nicely with malformed X12 data.
From: Neil Hodgson [mailto:nyamatongwe@users.sourceforge.net]
Sent: 27 April 2019 00:34
To: [scintilla:feature-requests] 1280@feature-requests.scintilla.p.re.sourceforge.net
Subject: [scintilla:feature-requests] #1280 X12 Lexer
I added some ~~~ code quoting to the above description as Markdown uses * and _ as formatting.
The preferred way to update Scintilla with a new lexer is to change the Scintilla.iface file then run scripts/LexGen.py as that updates SciLexer.h, adds the lexer to Catalogue.cxx, and adds the file to all the make and project files.
There are some minor pieces of dead code which may be incomplete changes or just inefficiencies. The condition on line 140 if (posCurrent == posFinish) can never be true as the loop header is while (posCurrent < posFinish) and posCurrent hasn't changed. Perhaps the code was changed to produce T instead of modifying posCurrent and T should be involved in the test.
On line 130 posSegmentStart is declared and set. It is later changed but is never read so appears redundant. From g++:
../lexers/LexX12.cxx: In member function 'virtual void LexerX12::Lex(Sci_PositionU, Sci_Position, int, Scintilla::IDocument*)':
../lexers/LexX12.cxx:130:16: warning: variable 'posSegmentStart' set but not used [-Wunused-but-set-variable]
Sci_PositionU posSegmentStart = 0;
It can be worthwhile running other compilers like g++, or static analyzers like cppcheck (http://cppcheck.net/), which reports:
scintilla\lexers\LexX12.cxx:140:18: warning: Opposite inner 'if' condition leads to a dead code block. [oppositeInnerCondition]
if (posCurrent == posFinish)
scintilla\lexers\LexX12.cxx:132:20: note: outer condition: posCurrent<posFinish
while (posCurrent < posFinish)
scintilla\lexers\LexX12.cxx:140:18: note: opposite inner condition: posCurrent==posFinish
if (posCurrent == posFinish)
scintilla\lexers\LexX12.cxx:130:16: warning: The scope of the variable 'posSegmentStart' can be reduced. [variableScope]
Sci_PositionU posSegmentStart = 0;
scintilla\lexers\LexX12.cxx:130:0: warning: Variable 'posSegmentStart' is assigned a value that is never used. [unreadVariable]
Sci_PositionU posSegmentStart = 0;
^
scintilla\lexers\LexX12.cxx:134:19: warning: Variable 'posSegmentStart' is assigned a value that is never used. [unreadVariable]
posSegmentStart = posCurrent;
scintilla\lexers\LexX12.cxx:162:20: warning: Variable 'posSegmentStart' is assigned a value that is never used. [unreadVariable]
posSegmentStart = T.pos + T.length;
scintilla\lexers\LexX12.cxx:189:0: warning: Variable 'indentNext' is assigned a value that is never used. [unreadVariable]
int indentNext = indentCurrent;
^
The results of static checkers aren't always valid or interesting but may indicate that the code evolved without updating all consequences.
[feature-requests:#1280] https://sourceforge.net/p/scintilla/feature-requests/1280/ X12 Lexer
Status: open
Group: Completed
Created: Fri Apr 26, 2019 06:43 PM UTC by Iain Clarke
Last Updated: Fri Apr 26, 2019 09:58 PM UTC
Owner: nobody
Related
Feature Requests: #1280
Committed as [a99fa0].
Related
Commit: [a99fa0]