I've spent the last week hacking at an implementation of IBM `XML GENERATE`. Attached is an initial, very simple patch for it. It has many, many problems:

- `cob_field`'s data pointer - with no regard for its size. To get the correct output, we obviously need to provide the entire `cob_field` and output a string of the correct length. This will require moving initialisation of the `cob_xml_tree`s to after all the fields are allocated.
- `XML GENERATE` clauses) -> (record structure annotated with XML details) -> (as before, but with the tree flattened into a list and XML elements/attributes/contents separated) -> (`cob_xml_tree` and `cob_xml_attr` structs in generated C code). I would like to reduce the two middle steps to one step.
- `WHEN` and `SUPPRESS` act like statements in the parser: if the parser is parsing a statement and sees one of them, it assumes it has reached a new statement. This is bad: `XML GENERATE` has `WHEN` and `SUPPRESS` as words in clauses. The current, ugly workaround is to add new tokens `WHEN_XML` and `SUPPRESS_XML` which don't behave like statements.
- `XML GENERATE` (and `XML PARSE`) need yet, so some words which should be context-sensitive aren't context-sensitive.
- `cob_field` to indicate that it contains multiple data fields (corresponding to each subscripted entry). However this problem is solved, what will quickly follow will be a fix for [bugs:#35] and an implementation of the `(ALL)` subscript.

Example usage:
    :::cobolfree
    >>SOURCE FREE
    IDENTIFICATION DIVISION.
    PROGRAM-ID. prog.
    DATA DIVISION.
    WORKING-STORAGE SECTION.
    *> x receives the generated XML document.
    01  x                   PIC X(200).
    *> y is the record to convert.
    01  y.
        03  z               PIC X(15) VALUE "hello, world!".
        03  az              PIC X(15) VALUE "goodbye, world!".
        03  ab.
            05  abc         PIC X(3) VALUE SPACES.
    PROCEDURE DIVISION.
        *> Rename abc and z, emit z as an attribute and
        *> omit items that contain only spaces.
        XML GENERATE x
            FROM y
            WITH XML-DECLARATION
            NAME OF abc IS "ABCDEF", z IS "zeta"
            TYPE OF z IS ATTRIBUTE
            SUPPRESS WHEN SPACES
        DISPLAY FUNCTION TRIM(x)
        .
Output:

    :::xml
    <?xml version="1.0"?>
    <y zeta="hello, world! goodbye, world! "><az>goodbye, world! </az><ab/></y>

What it should output:

    :::xml
    <?xml version="1.0"?>
    <y zeta="hello, world!"><az>goodbye, world!</az></y>
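The gap between the two outputs comes down to treating a fixed-length, space-padded field as if it were a NUL-terminated string. A minimal sketch in C of the intended behaviour, assuming a hypothetical `struct fixed_field` as a stand-in for `cob_field` (the real libcob structure and the patch's actual code are not shown in this thread):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a fixed-length alphanumeric field:
   `data` is NOT NUL-terminated and is padded with spaces up to `size`. */
struct fixed_field {
    const char *data;
    size_t size;
};

/* Returns the field length with trailing spaces removed. */
static size_t trimmed_length(const struct fixed_field *f)
{
    size_t len = f->size;
    while (len > 0 && f->data[len - 1] == ' ') {
        --len;
    }
    return len;
}

/* True if the field holds only spaces, i.e. it would match
   a SUPPRESS WHEN SPACES condition in this sketch. */
static int is_all_spaces(const struct fixed_field *f)
{
    return trimmed_length(f) == 0;
}

/* Copies the trimmed field into `out` (capacity `out_size`) and
   NUL-terminates it. Returns the number of characters copied,
   truncating if `out` is too small. */
static size_t copy_trimmed(const struct fixed_field *f, char *out, size_t out_size)
{
    size_t len;
    if (out_size == 0) {
        return 0;
    }
    len = trimmed_length(f);
    if (len > out_size - 1) {
        len = out_size - 1;
    }
    memcpy(out, f->data, len);
    out[len] = '\0';
    return len;
}
```

Applied to the record above, `z` (15 bytes) would yield the 13 characters `hello, world!`, and `abc` would be reported as all spaces and could be skipped.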
Diff:
This update has field output of the right length and resets `SUPPRESS` conditions between `XML GENERATE` statements. This patch also includes xml.c, which I forgot from the first.

Example:
Improved, but still incorrect, output:
Diff:
Version 3 has a fully functional `SUPPRESS` clause, correct trimming of alphanumeric data, reserved word fixes and tests.

Diff:
Version 4 adds the `XML-CODE` register, detection of invalid XML characters in COBOL names or COBOL records, and the ignoring of subrecords which are `FILLER`, `REDEFINES` or `RENAMES`.

Last edit: Edward Hart 2018-08-05
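The kind of check version 4 describes can be illustrated with a small C sketch. It is an ASCII-only approximation of the XML 1.0 `Name` and `Char` productions; the function names are hypothetical, bytes of 0x80 and above are accepted without looking at the encoding, and none of this is taken from the patch itself:

```c
#include <stddef.h>

/* ASCII subset of the XML 1.0 NameStartChar production. */
static int is_name_start_char(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
        || c == '_' || c == ':';
}

/* ASCII subset of the XML 1.0 NameChar production. */
static int is_name_char(unsigned char c)
{
    return is_name_start_char(c) || (c >= '0' && c <= '9')
        || c == '-' || c == '.';
}

/* Returns 1 if `name` could serve as an element or attribute name. */
static int is_valid_xml_name(const char *name)
{
    size_t i;
    if (name == NULL || !is_name_start_char((unsigned char)name[0])) {
        return 0;
    }
    for (i = 1; name[i] != '\0'; ++i) {
        if (!is_name_char((unsigned char)name[i])) {
            return 0;
        }
    }
    return 1;
}

/* Returns 1 if every byte is tab, LF, CR or at least 0x20, which is
   the single-byte part of the XML 1.0 Char production. */
static int has_only_xml_chars(const unsigned char *data, size_t size)
{
    size_t i;
    for (i = 0; i < size; ++i) {
        unsigned char c = data[i];
        if (c < 0x20 && c != 0x09 && c != 0x0A && c != 0x0D) {
            return 0;
        }
    }
    return 1;
}
```

A COBOL name such as `1ST-ITEM` is legal in COBOL but fails this check, because an XML name may not start with a digit.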
Diff:
Related
Wish List: #425
The patch is nearly complete. Version 5 adds syntax checks, config options and completes the exception handling test.
Diff:
Hi Edward,
thank you for the patch. Some first notes:
- `--with-xml2`?
- `xml2-config` is not available, first a fall-back to `pkg-config --cflags libxml-2.0` and in general testing if we can compile a minimal sample and use the library (if neither xml2-config nor pkg-config provide something we leave it to the user to specify necessary `C[PP]FLAGS`) - on at least one of my testing machines xml2-config is not available and there is no pkg-config entry for libxml (no version) but library and headers are available
- `print_info()` functions

Simon
Last edit: Simon Sobisch 2018-08-10
Whoops. It's included in the attached updated patch.
Yes.
Done.
I've added a check for `pkg-config` and a default case where we check for the headers and `-lxml2`.

libxml2 comes with a `LIBXML_TEST_VERSION` macro which I think does something like that.

Thank you. configure.ac contains a copy&paste issue: "xml library is required as -ldb". For `AC_MSG_NOTICE([ Use libxml2 for XML I/O yes])` we should add an `else` (or directly output `$COB_HAS_XML2`). There is a similar copy&paste issue in `var_print (_("ISAM handler"), "libxml2",`.

syn_misc.at contains `AT_SKIP_IF([test "$COB_HAS_XML2" = "no"])`, shouldn't this be necessary only for run_xml.at?

As XML PARSE doesn't have many clauses, can you please add it to parser.y with a PENDING notice?
I'd like to compile and run the tests next week on some machines to verify that this will work on more environments; afterwards I'd say it is time to commit it. What do you think about this?
Last edit: Simon Sobisch 2018-08-11
Version 7 fixes the mistakes in copying-and-pasting, fixes memory leaks and includes a refactoring of all the cobc changes.
All that remains (for me, at least, to do):
- `OCCURS` and floating-point items.
- `build_windows.h.in`)
- `XML PARSE` to parser.y.
- `NEWS` entry.

Last edit: Edward Hart 2018-08-12
Version 8 includes changes for Visual C++: replaced `free()` in xml.c with `xmlFree()` (fixing a heap exception which did not occur on Ubuntu) and updated build_windows/config.h.in and build_windows/README.txt. NB: This patch includes some unintentional changes which will be removed in version 9 (e.g. changing VBISAM to BDB in config.h.in and changes in the .sln files); they are only included because I can't be bothered to remove them tonight.

Committed in [r2688].
Thank you very much for this marvelous patch.
It inspired me to go a little bit deeper on some details (mainly the build environment), so I took the time to do some minor code adjustments (minimal URI check without libxml2, compiler warnings), build_windows adjustments and tweaked configure.
The last one actually brings in some changes:
- `--with-xml` will enforce it and abort if not possible to compile+link it); the reasons for this:
- `pkg-config` via m4 macro, this brings the benefits that it is more compatible for cross-compilation this way, we get two nice variables to override the `x_CFLAGS` and `x_LIBS` (both specified = `pkg-config` not called at all)
- `pkg-config` gets priority over `xml2-config` as it is available on more machines and will be used for other libraries in the future (it doesn't work fine for curses, bdb or vbisam :-( )
- `x_CFLAGS`/`x_LIBS` to the `xml2-config` part

I just used "XML2" as internal prefix for `pkg-config`, but we may change it to `XML`. Thoughts?

Do you have any plans on working on the following related parts?

- `XML PARSE`
- `OCCURS` entries as you've outlined above

Thank you for improving my configure changes.
Which ones? A quick Google search suggests it's available on Solaris, HP-UX and z/OS, which I thought were the most "exotic" systems we support.
I've only tested my changes with libxml2, so I'd be surprised if they work with libxml1, which, in any case, is no longer widely available. There's no standard XML API, so my changes won't work with other XML libraries either. So I think the prefix `XML2` is best.

I have no plans currently. I will note that:

- `XML PARSE` will require me to get familiar with libxml2's parsing APIs. I'll probably have to get advice from the libxml mailing list before I know where to start.
- `OCCURS` is the most important thing to fix.
- `JSON GENERATE` will be very easy to implement now.

Last edit: Edward Hart 2018-08-21
The headers and libraries are available but may not be pre-installed (it isn't hard to install them, but it is an additional step that may also break automated build systems as long as the additional dependency is not in).
And there are cases like the machine I currently `ssh` to (SLES9):

... note: I just recognized that there are some parts to adjust as the changed configure parts don't work on this machine... Will try to fix it this evening... EDIT: sometime this week.
:-) Don't do stuff like this in production, it may magically link :-)
Solaris:
Note: The "XML GENERATE syntax checks" test fails when tree-debug is active, can you inspect this, please?
Last edit: Simon Sobisch 2018-08-20
Hm...
PLEASE add `JSON GENERATE` and (parsing only) `JSON PARSE` :-)

As this will get in more "registers" I actually think we also should adjust the register handling within libcob.
What do you think about something like the following?
And then generate the field reference in a local table and its pointer in the `cob_module` structure, instead of the current `cob_module->xml_code` (similar to what is done with `const char **module_sources`).

The references would be set only once during `cob_module_global_enter()`.

Opinions?
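As a rough illustration of the table idea, here is a purely hypothetical C sketch. It is not the snippet originally posted and does not use libcob's real definitions: `struct field`, `struct register_entry`, `struct module` and `find_register` are all invented stand-ins. The module holds a pointer to a per-module table of named register references instead of one dedicated member per register:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for cob_field; the real struct differs. */
struct field {
    size_t size;
    unsigned char *data;
};

/* One entry per special register known to a module (assumed layout). */
struct register_entry {
    const char *name;       /* e.g. "XML-CODE" */
    struct field *field;    /* storage backing the register */
};

/* Hypothetical module descriptor: points at a table built once,
   rather than carrying a separate member such as xml_code. */
struct module {
    const struct register_entry *registers;
    size_t register_count;
};

/* Looks up a register by name; returns NULL if the module lacks it. */
static struct field *find_register(const struct module *m, const char *name)
{
    size_t i;
    for (i = 0; i < m->register_count; ++i) {
        if (strcmp(m->registers[i].name, name) == 0) {
            return m->registers[i].field;
        }
    }
    return NULL;
}
```

In this sketch the table would be filled in once when the module is first entered; adding a further register then means adding a table entry rather than a new structure member.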
Ah, sorry for not replying to this sooner.
I started looking into `JSON GENERATE`, but immediately ran into the problem of deciding which JSON library to use. It's a choice between

Of these, JSON-GLib is the only one which is part of a wider project and not just a GitHub repo. But I'm not sure how many dependencies JSON-GLib has. We need something portable, something powerful enough to support `JSON PARSE` (and maybe even `ORGANIZATION JSON`) and something which will be maintained for years to come.

Regarding register changes: they'd be fine to implement, but I'm not sure why we need them.

Last edit: Edward Hart 2018-09-01
I've looked a few times, and cJSON seems like a sane choice. Knows a little bit about UTF-16 encoding, which is a step in the right direction for future NATIONAL UCS-2/other support. cloc's in at half the lines of code as say jansson, and scores 99% on conformance test linked below. But it's not in Debian, that I can tell. Which may mean it's in less major distro channels than some of the other choices.
To help make the decision, this page has conformance and performance benchmarks on most of the C implementations, Edward.
https://github.com/miloyip/nativejson-benchmark