Commit GNU-COBOL fileio re-engineered

  • Joe Robbins

    Joe Robbins - 2014-08-05

    Notes on the recent commit of GNU-COBOL fileio re-engineered.

    On 2 Aug 2014 I committed the files that constitute a new version of GNU COBOL fileio, together with those other files that have been edited to match the new fileio. Although some new features are supported, the main rationale for these changes was to atomise fileio so that it was more easily understood. The new structure will help programmers in future to change parts of fileio without inadvertently "breaking" other parts. Which code is in/excluded for any particular configuration should now be much clearer.

    Here's a recap on the revised configuration options for fileio (default given first):

    --with-relative = yes | no            include code for ORGANIZATION RELATIVE
    --with-sequential = yes | no          include code for ORGANIZATION (LINE) SEQUENTIAL
    --with-indexed = bdb | bdb1 | vbisam | disam | cisam | odbc
                                              implement ORGANIZATION INDEXED using nominated technology
    --with-file-status = ext[ended] | ansi85 | native | rm 
                                              basis for FILE STATUS values
    --enable-relative-fixed [no]          ORGANIZATION RELATIVE files use variable or fixed record size
    --enable-fileio-trace [no]            include code to trace fileio processing
    --enable-fileio-stats [no]            include code to produce fileio statistics
    --enable-fileio-log [no]              include code to log fileio activity
    --enable-fileio-sharing [no]          include code to support FILE SHARING
    --enable-fileio-exchange [no]         include code to interface to a user-supplied file-handler
    --enable-odbc-dynamic-connection [no] ODBC INDEXED : parse file names to extract SERVER and DB
    --with-varseq = 0 | 1 | 2 | 3         determines format of SEQUENTIAL variable length records

    In the directory ~/tests/testsuite.src the files and contain COBOL programs (distilled to the bare essentials) that demonstrate some of the features of fileio. Suggestions for new test-cases always welcome!

    Most testing has been done on LINUX platforms. More development and testing is required on MinGW.

    1. --with-indexed = bdb: This imposes a restriction on the version of the BDB libraries that you may use: BDB 4.1 or later. (Tests have been done using BDB 5.3.21 and BDB 4.1.)
      Use --with-indexed = bdb1 for earlier versions of BDB. You will need to copy/link the relevant .h file - they had a different name on each BDB release - to "db1.h" (in /usr/include). bdb1 configuration has had only superficial testing.
    2. SPLIT-KEYS tested in all INDEXED implementations.
    3. RECORD-LOCKING is still being worked on.
    4. For ORGANIZATION INDEXED, most testing has been done using BDB or VBISAM. Further testing is required using CISAM and DISAM.
    5. FILE SHARING (whole file locking / EXCLUSIVE) tested on LINUX.
    6. For ORGANIZATION INDEXED, flexibility with keys is provided. Not all secondary keys need to be declared in the COBOL programs all the time. Records written to the file will still be indexed (as though all the secondary keys had been declared). VBISAM etc. effectively maintain a "dictionary" of all keys declared when the file was first created, so this works intrinsically. BDB has no "understanding" of how keys are being extracted from the record-data; to support flexibility, new fileio creates a "key dictionary" for each BDB-managed file. If the dictionary is "lost" it is re-created when the file is next opened - preferably with ALL the COBOL keys declared.
    7. When fileio throws an exception, it can now report the errno from the underlying code, along with a matching English text. It also reports internal, external and fully pathed file-names. This removes ambiguities when COBOL source assigns names dynamically to a file; reports like "file1 status 30" were little help when diagnosing a problem. The usual rules apply: if you don't code a USE PROCEDURE for the file in the DECLARATIVES section, declare a FILE STATUS, or use AT END / INVALID KEY, the default exception-handler kicks in.
      When file-status-1 = "9", fileio checks --with-file-status. If "ansi85" the file-status value stands. If "rm" (Ryan McFarland) file-status "90" thru "99" are used. If "native" and the exception originated below fileio (e.g. LINUX, BDB, VBISAM) file-status-1 is switched to "8" and the error-code returned from the underlying system is placed in file-status-2.
      COBOL code can force the exception-handler to disgorge the last exception encountered by using CALL "CBL_REPORT_IO_EXCEPTION". Here's an example:
           procedure division.
               open input file1.
               write file1-rec.
               call "CBL_REPORT_IO_EXCEPTION".
               close file1.
               stop run.

    Produces this display:

    libcob: Exception encountered while processing FILE: file1 :: /var/bawtry/elect_8.00/data/file1 :: /var/bawtry/elect_8.00/data/file1
    ....... STATUS-KEY:       48 [WRITE on file not opened for output]
    ....... COBOL RTS ERRNO:  33 [Attempt to write to a file not opened for output]
    ....... File I-O :: COBOL RTS :: write-record :: Recoverable

    If the program had NOT declared FILE STATUS on file1, the message would have been displayed after executing "write file1-rec" - with no need for "call "CBL_REPORT_IO_EXCEPTION"".

    This process is controlled by files libcob/fileio-status.def and libcob/fileio-exception.def. (Scope to internationalize these files???)
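The "native" branch of the --with-file-status handling described in item 7 can be sketched roughly as below. This is only an illustration under stated assumptions: the enum, the function name and the choice to store the native error code as a single raw byte in file-status-2 are hypothetical and are not taken from the fileio sources. The "rm" renumbering is left out because the post does not spell out the mapping.

```c
/* Hypothetical names, for illustration only (not the fileio API). */
enum sketch_status_mode { SKETCH_ANSI85, SKETCH_RM, SKETCH_NATIVE };

/*
 * Adjust a two-byte FILE STATUS whose first byte is '9'.
 * below_fileio is non-zero when the exception originated beneath fileio
 * (for example the OS, BDB or VBISAM); native_code is that layer's error.
 */
void sketch_adjust_status(unsigned char status[2],
                          enum sketch_status_mode mode,
                          int below_fileio, int native_code)
{
    if (status[0] != '9') {
        return;                     /* only 9x statuses are re-examined */
    }
    if (mode == SKETCH_NATIVE && below_fileio) {
        status[0] = '8';
        /* Assumption: the code is kept as one raw byte in file-status-2. */
        status[1] = (unsigned char)(native_code & 0xFF);
    }
    /* SKETCH_ANSI85: the value stands.
       SKETCH_RM: "90" thru "99" are used; mapping not shown in the post. */
}
```

With a status of "90", mode SKETCH_NATIVE and a native code of 22, the sketch leaves '8' in the first byte and the value 22 in the second; with SKETCH_ANSI85 the status is left as it was.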


    1. On MinGW, do not use --enable-fileio-sharing. (A substitute for "fcntl" is needed.)
    2. On MinGW, do not use --enable-fileio-log. (A substitute for "syslog" is needed.)

    I guess this is a huge upheaval for GNU-COBOL. But we can have some confidence in the code because it passes the COBOL85 test-suite and tests/. I hope that the benefits of the new code will make the extra work worthwhile. I will treat any trouble reports related to fileio with urgency if this enables the new fileio to get released.

    There is a danger that the code has "slipped backwards": that fixes made to "old" fileio have come unstuck. This is particularly relevant since the original code base for "new" fileio was the earlier OpenCOBOL. If anyone is aware of recent fault reports (say last 2 years!) related to fileio, please let me know and I will ensure new fileio honours them. (Is there any way of searching the archive at Sourceforge for this?)

    Anyone wanting to see how fileio works for their COBOL program should ./configure --enable-fileio-trace and run programs using: COB_FILEIO_TRACE=9 prog1 2>prog1-trace. The trace is written to stderr.


    1. Test & fix record locking.
    2. MinGW syslog.
    3. MinGW fcntl.
    4. Sometimes BDB appears to "hang" on MinGW when running COBOL85 tests (like just writing 500 records to a file hangs on some "arbitrary" record). Definitely not looping - just a hang.
    5. There is an "issue" I am analysing with BDB ... On OPEN OUTPUT the old file(s) are removed. This is done via BDB dbremove(). Unfortunately BDB seems to open the file before removing it. Frequently on running tests you get this situation: prog1.cbl uses FILE file1 ORGANIZATION RELATIVE. prog2 uses FILE file1 ORGANIZATION INDEXED. BDB "coughs" if the first file1 from prog1 is still on disk when prog2 is run.
    libcob: Exception encountered while processing FILE: file1 :: ./file1 :: ./file1
    ....... STATUS-KEY:       9/082 [Extended status-key: status-key-2 stores COBOL RTS error number]
    ....... COBOL RTS ERRNO:  82 [Failed to delete existing index files]
    ....... NATIVE ERRNO:     22 [Invalid argument]
    ....... INDEXED File I-O :: BDB Package :: open-file :: Fatal

    file1 has to be removed manually. On successive runs, the program works; i.e. file1 is removed silently as expected.

    Joe Robbins

    Last edit: Simon Sobisch 2014-09-05
    • Vincent (Bryan) Coen


      On files with access LINE SEQUENTIAL does it place a LF/CR or LF at end
      of line depending on platform?

      OC v1.1 seems to create a blank line between all on Linux possibly
      because it is sending windows/dos line termination.


      Last edit: Simon Sobisch 2014-08-06
      • Joe Robbins

        Joe Robbins - 2014-08-07

        Hello Vince
        fileio-sequential.c, when writing LINE SEQUENTIAL files, executes "...write_opt()" either BEFORE or AFTER writing the text, depending on the COBOL source: WRITE BEFORE / AFTER.
        "...write_opt()" uses putc('\n') for newlines and putc('\f') for form-feeds. How '\n' is mapped to 0x0a or 0x0d0a is up to the libraries. Using MinGW (on WINDOWS) it maps to 0x0a. This makes sense because MinGW is attempting to present a GNU "face".

        There is a test case in testsuite.src/ which was carried from earlier versions and reads:

        ## OPEN + WRITE + CLOSE with external check
        ## ------------------------------------------------------------------
        AT_SETUP([File ORG:LINE-SEQ ACC:SEQ write])
        AT_DATA([prog.cob], [
               PROGRAM-ID.      prog.
               ENVIRONMENT      DIVISION.
               INPUT-OUTPUT     SECTION.
               SELECT TEST-FILE ASSIGN       "./TEST-FILE"
                                ORGANIZATION IS LINE SEQUENTIAL.
               DATA             DIVISION.
               FILE             SECTION.
               FD TEST-FILE.
               01 TEST-REC      PIC X(4).
               PROCEDURE        DIVISION.
                   OPEN OUTPUT TEST-FILE.
                   MOVE "a"    TO TEST-REC.
                   WRITE TEST-REC
                   MOVE "ab"   TO TEST-REC.
                   WRITE TEST-REC AFTER 1 LINES
                   MOVE "abc"  TO TEST-REC.
                   WRITE TEST-REC BEFORE 2 LINES
                   MOVE "abcd" TO TEST-REC.
                   WRITE TEST-REC
                   CLOSE TEST-FILE.
                   STOP RUN.
        AT_CHECK([$COMPILE -o prog prog.cob], [0], , [])
        AT_CHECK([cat TEST-FILE], [0],

        This test passes on both LINUX and MinGW. If we hex-dump the output ("TEST-FILE") it is the same on BOTH LINUX and Windows:

        [root@lima tmp]# bcs_hd <TEST-FILE
        000000 61 0a 0a 61 62 61 62 63 0a 0a 61 62 63 64 0a     >a..ababc..abcd.<
        [root@lima tmp]# bcs_hd </tokyo/joe/Temp/TEST-FILE
        000000 61 0a 0a 61 62 61 62 63 0a 0a 61 62 63 64 0a     >a..ababc..abcd.<
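The byte sequence in the dump can be reproduced with a small model of the BEFORE / AFTER behaviour. This is a minimal sketch, assuming that a WRITE with no ADVANCING clause behaves like BEFORE ADVANCING 1 LINE and that trailing spaces have already been stripped from the record. The enum and function names are hypothetical and this is not the code in fileio-sequential.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum sketch_advance { SKETCH_DEFAULT, SKETCH_BEFORE, SKETCH_AFTER };

/*
 * Append one LINE SEQUENTIAL record to buf (current length len) and
 * return the new length. text is the record with trailing spaces
 * already removed. The caller must supply a large enough buffer.
 */
size_t sketch_write_line(unsigned char *buf, size_t len, const char *text,
                         enum sketch_advance adv, int lines)
{
    size_t n = strlen(text);
    int i;

    if (adv == SKETCH_DEFAULT) {    /* no ADVANCING clause given */
        adv = SKETCH_BEFORE;
        lines = 1;
    }
    if (adv == SKETCH_AFTER) {      /* advance first, then write the text */
        for (i = 0; i < lines; i++) {
            buf[len++] = '\n';
        }
    }
    memcpy(buf + len, text, n);
    len += n;
    if (adv == SKETCH_BEFORE) {     /* write the text, then advance */
        for (i = 0; i < lines; i++) {
            buf[len++] = '\n';
        }
    }
    return len;
}
```

Calling the function for the four WRITE statements of the test ("a" default, "ab" AFTER 1, "abc" BEFORE 2, "abcd" default) fills the buffer with the same 15 bytes as the dump. Whether a '\n' in such a buffer reaches the disk as 0x0a or 0x0d0a depends on how the C stream is opened: a binary-mode stream ("wb") writes it unchanged, while a text-mode stream on Windows expands it to CR LF.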

        I am not sure if GNU-COBOL needs to go any further on this. Most printers, comms, etc can be configured to take either/both 0x0A (LF) and 0x0D (CR). Even Wordpad recognises both styles. Here, if a special need arises we pipe the text through a filter - but that is on LINUX.

        Regards: Joe Robbins

        Last edit: Simon Sobisch 2014-08-16
        • Vincent (Bryan) Coen

          Hi Joe;

          Your test results (excluding the jump from AFTER to BEFORE) show that you
          are writing out two LF chars (linefeeds) after the first 'a' (H'61')
          followed by ababc then two LF.

          Looking at the Cobol code running on a PC (not an IBM m/f as that is
          different by having a 1st char for print control) then under Linux you
          should be showing (at least according to me) :

          0a 61 0a 61 62 61 62 63 0a 0a 0a 61 62 63 64 0a (at file close?)

          Clearly, if running under Windows, each 0a changes to 0d0a.

          See the manual which shows for the WRITE verb:

          1. Both of these file types will use an end-of-record delimiter
            character sequence to signify where one record ends
            and the next record begins. This delimiter sequence may be any of the
            following:

          a. A line-terminator sequence consisting of an ASCII
          carriage-return/line-feed character sequence (X’0D0A’) if
          you are running a MinGW or native Windows build of GNU COBOL

          b. A line-terminator sequence consisting of an ASCII line-feed character
          (X’0A’) if you are running a Cygwin,
          Linux, Unix or OSX build of GNU COBOL

          c. An ASCII formfeed character

          1. If no ADVANCING clause is specified on a WRITE to an ORGANIZATION
            LINE SEQUENTIAL file, BEFORE
            ADVANCING 1 LINE will be assumed.

          2. If no ADVANCING clause is specified on a WRITE to a LINE ADVANCING
            file, AFTER ADVANCING 1 LINE will be assumed.


          Last edit: Simon Sobisch 2014-08-07
          • Joe Robbins

            Joe Robbins - 2014-08-07

            Hello Vince

            I picked the example from testsuite.src because it had not been written by me. It was lodged specifically to check that when COBOL changes from DEFAULT to AFTER to BEFORE and back to DEFAULT the correct number of line-feeds are written. I guess in the context of this discussion it was over-complicated.

            I will run some simpler tests on OpenCOBOL, new GNU-COBOL and MF-COBOL and continue this discussion under a new posting.

            But just food for thought: given your quote PARA 1. from the manual that "WRITE BEFORE ADVANCING 1 LINE" is the default for LINE SEQUENTIAL, why does your output begin with a LF?

            Regards: Joe Robbins

            • Vincent (Bryan) Coen

              Nuts I was looking at option 9 and should have been option 8.

              Yes you are right.


              Last edit: Simon Sobisch 2014-08-16
        • Vincent (Bryan) Coen

          Forgot to add:

          I have an HP 8600 Plus inkjet all-in-one.

          There is NO control over what a LF or CR does but clearly it works with
          LF only.
          As far as I can work out the CR is just ignored. On a matrix printer
          the CR was significant but on inkjets and lasers it does not seem to do


        • Simon Sobisch

          Simon Sobisch - 2014-08-07

          fileio should always use the native EOL.

          For usage within the test suite (and for personal needs if you want to always have x'0a') fileio has COB_UNIX_LF, set to YES in atlocal. This behaviour shouldn't be lost ;-)


          • Vincent (Bryan) Coen

            Native EOL for Linux is LF
            Likewise for unix.




    • Simon Sobisch

      Simon Sobisch - 2014-08-07

      Latest revision compiles but the configuration isn't updated:

      ./configure --with-indexed=vbisam
      configure: WARNING: unrecognized options: --with-indexed

      Please check as you should change and regenerate configure by calling

      autoreconf -I m4

      If all is done correctly the configure options will be shown by ./configure --help, too.
      I can help you with autoconf if you need help.


      BTW: Please add a configuration error if the old options --with-vbisam, --with-cisam, ... were used. Something like

      AC_MSG_ERROR([Obsolete option --with-vbisam used, use --with-indexed=vbisam instead])
      • Joe Robbins

        Joe Robbins - 2014-08-07

        Hello Simon

        My SVN shows that my local was committed 18 JUNE 2014 and there is nothing new to commit. I kinda assumed that you would want to run it from the top starting with autoreconf -I m4.

        I'll add the "obsolete" warnings.

        Regards: Joe Robbins

        P.S. I don't receive an email when folks post here but I do get notified of each commit. I must re-check my Sourceforge options. This explains why I don't reply sooner. Thanks for your patience!

        • Simon Sobisch

          Simon Sobisch - 2014-08-07

          I see.

          If possible (the version numbers of autoconf/automake are the same as the versions previously used) configure (and, ...) should always be committed together with the sources. Same for Flex and Bison generated files. If your version numbers match please do so; otherwise I'll regenerate all files in fileio-rw-branch tomorrow and commit them afterwards.


          BTW: The messages from SourceForge you're talking about are available in each forum. You can subscribe either to specific topics (use check-boxes and click on the "Update email subscriptions") or the whole forum (must be done for all of the three forums you want to subscribe to).

          • Simon Sobisch

            Simon Sobisch - 2014-08-08

            All files in fileio-rw-branch are regenerated and committed in [r407]. There were some inconsistencies (mostly missing changes to config-files, syn/ not referenced) that were fixed on the way, together with some merging from 2.0-branch.

            I've (re-)added the BDB parts that were missing from 2.0-branch to configure (check version numbers between header/library, check both -ldb and -ldbn.n [n.n being the version number seen in the header]).

            Now everyone should be able to do a checkout and start using this version.

            Issues that I've seen so far (when doing a default configure without any options):

            • the testsuite errors with different "wrong file status 00"
            • every compilation seems to lead to BDB home files (likely a wrong/too early initialization); this can likely be fixed during the review of 2.0-branch (r266)/fileio-branch (HEAD): fileio.c <-> fileio.c, while fileio.c <-> fileio-*.c

            There are still lots of revisions to merge from 2.0-branch (currently 39) which will be done by me, but as many of them are fileio-related we need the review of fileio.c <-> fileio*.c which should be done by Joe before.




            Commit: [r407]

            • Simon Sobisch

              Simon Sobisch - 2014-08-08

              I've started to investigate the parser changes and did some tweaks (including bringing back some stuff that works in 2.0-branch). There's one big point open: SPLIT KEY support for primary keys (test cases are already in).

              And another one: cobc/Changelog has no entries for fileio-rewrite, yet.

              @Joe: Please try to take care of both as I try to add SPLIT KEY support and SUPPRESS WHEN to 2.0 before rc (I hope that the Changelog entries will help me to see the necessary parts for including it).


              • Vincent (Bryan) Coen

                On another note regarding compiler (cobc) changes:

                The compiler does not produce a clean listing which includes copy libraries and comments.

                A. The compiler does not action the processes to do above in the right order such as:

                1. Read source
                2. Action all compiler directives when found
                3. Action COPY library processing updating the source statements read
                4. Using these previous 2 steps, produce a listing if requested and pass the updated source on for next-stage processing, which includes expanding fixed format to free and removing unneeded white space that is not in literals, so that only one space is between words.
                  ... and continue to parse and all other phases.

                B. At the moment as we all know it processes A1,A3 then A2 which makes any listing including that which is passed to cobxref via the -Xref flag incomplete as a support document because it is hard to read and follow logic flow and all comments are lost.

                Can I suggest that someone has a look at the code in cobc to see if it is possible to change it to work in a more sensible order such as highlighted in (A).

                As anyone who has used or tried to use cobxref knows, at Roger's request the code that processed COPY libraries and basic compiler directives and then produced a listing prior to continuing with the rest of the XREF processing was changed to accept as input the source program's .i file, which has the layout described in (B).

                I am seriously debating returning the code back to process COPY libraries to a depth of 9 along with the listing in a similar way to the printcbl program.

                It does mean adding a largish block of Cobol code to cobxref but hopefully will get rid of some of the odd errors it is now producing although there still might be the odd one outstanding :)

                Cobxref produces a XREF listing which is broken down showing where each variable is declared e. g., each WS Section as well as declaratives in special names etc.

                A new extra will be showing the 01 record layouts defined in both File Section and Working Storage BUT will not show the usage of slack bytes if used by the compiler. I cannot see how to do this based on the endianness used and size etc. (I am sure there are other variables to it that I can't think of at the moment).

                Comments anyone?


                Last edit: Simon Sobisch 2014-08-16
                • Simon Sobisch

                  Simon Sobisch - 2014-08-16

                  Hi Vince,

                  as this topic is only about the fileio-rewrite the post is quite off-topic. I've opened a ticket (#32) about the listing for discussion and volunteering. I suggest editing your post and moving everything that covers this to that ticket.


    • Simon Sobisch

      Simon Sobisch - 2014-08-16

      I've just thought about the MinGW points you've mentioned.

      1. syslog --> the usage of syslog() is an interesting idea, but if we add syslog (not sure about this) it should be optional. Please change the current use of syslog to fprintf (cob_filetrace_file, ...) and implement it like cob_trace_file (see common.c). Thank you!
        If we add syslog later on it should be used for both traces and runtime errors/warnings, too.

      2. fileio-sharing should be enabled by default as it was with OpenCOBOL before (at least if fcntl is available, which is checked during configure). Doesn't it work in MinGW?

      3. BDB hanging - this likely depends on the question if DB_HOME is set or not. It looks like you call db_env_create no matter if it's set or not, which is quite different than old fileio. Please check this for everything that has to do with BDB.
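Point 1 above (routing trace output through fprintf to a file handle instead of syslog) could look roughly like the following. This is a sketch only: cob_filetrace_file is the name used in the post, but the sketch_ functions, the level handling and the fallback to stderr are assumptions, and nothing here is copied from the cob_trace_file code in common.c.

```c
#include <stdarg.h>
#include <stdio.h>

/* Name taken from the post; everything else here is hypothetical. */
static FILE *cob_filetrace_file = NULL;
static int sketch_trace_level = 0;

/* Open the trace destination; a NULL or unopenable path falls back to stderr. */
void sketch_trace_init(const char *path, int level)
{
    sketch_trace_level = level;
    cob_filetrace_file = NULL;
    if (path != NULL) {
        cob_filetrace_file = fopen(path, "w");
    }
    if (cob_filetrace_file == NULL) {
        cob_filetrace_file = stderr;
    }
}

/* Write one trace message if its level is enabled; returns characters written. */
int sketch_trace(int level, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (cob_filetrace_file == NULL || level > sketch_trace_level) {
        return 0;
    }
    va_start(ap, fmt);
    n = vfprintf(cob_filetrace_file, fmt, ap);
    va_end(ap);
    fflush(cob_filetrace_file);     /* keep the trace usable after a crash */
    return n;
}
```

A caller would run sketch_trace_init(NULL, 9) once and then use calls such as sketch_trace(5, "open %s\n", "file1"). Keeping the destination behind one FILE pointer means a syslog backend could be added later as an option without touching the call sites.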


  • Simon Sobisch

    Simon Sobisch - 2014-08-06

    Hi Joe, I guess this means the primary work is done already?

    Please change fileio-status.def and fileio-exception.def and mark the strings as translatable. Sample:

    -COB_STATUS (00, COB_STATUS_00_SUCCESS, "Successful")
    +COB_STATUS (00, COB_STATUS_00_SUCCESS, _("Successful"))

    And as the current svn rev is not compilable please do

    svn add fileio-exception.def
    svn commit -m "Add missing fileio exception definition"

    As soon as the file is in I'll merge the changes done in 2.0 branch.

    Thank you for your efforts,

    • Joe Robbins

      Joe Robbins - 2014-08-07

      Hello Simon
      Files fileio-status.def and fileio-exception.def edited for _("str") and committed.
      Sorry for omission of fileio-exception.def; I "ticked-off" exception.def - forgetting these are 2 separate files.

      Regards: Joe Robbins

  • Simon Sobisch

    Simon Sobisch - 2014-08-06

    One of the 2.x changes from Roger that seems to be missing is the set of functions cob_fileio_getenv, cob_chk_file_env, cob_chk_file_mapping. I think this wasn't intentional but was lost when committing the old fileio?

    Please diff revision 266 from 2.0 branch against HEAD from fileio branch. At least for fileio.c <-> fileio.c, while fileio.c <-> fileio-*.c should be done, too.

    Another question is if we want to have all the fileio*.h files (Roger included fileio.h, which was split by you into different header files, into common.h).


    • Joe Robbins

      Joe Robbins - 2014-08-07

      Hello Simon
      I had spotted the <<new>> way of harvesting environmental variables but left it on one side. Will commit for this in next few days.

      With regard to header files: I stated my views in a post a few months ago. I am no expert, but it seems to me that it helps if modules expose a PUBLIC and a PRIVATE interface. While all fileio was in one file, the public interface was defined in common.h; the PRIVATE was hidden by placing (forward) headers inside fileio.c. With fileio "modularised", I wanted to distinguish between public (i.e. called from outside fileio) and private functions. Hopefully, future hackers working on fileio can safely assume they can change anything (without regard to other modules) so long as they preserve the PUBLIC interfaces, which are all declared in fileio.h, fileio-call.h and fileio-sort.h. Anyway, a .h per module "feels right".

      The problem - using C - is the complexity, circularity and duplication that can arise when including .h files. Sometimes the "easy" answer is to ram it all into one big super header file. But often this demonstrates a lack of structure to the software components themselves.

      I edited libcob.h to include the 3 PUBLIC header files from fileio. I removed all fileio related declarations that I could see from common.h. I hope this arrangement can be preserved into the final release.

      One last thought on header files. I tend to hold the opinion that each .h file should include ALL the .h files on which it is dependent. For example: if abc.h has a declaration such as "FILE *fp;" then it should have "#include <fileio.h>". It should not rely on client programs having to code #include <fileio.h> and ensure it comes earlier in the code than #include "abc.h". But I haven't gone any further with the GNU-COBOL header files than to accommodate "new" fileio.
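The "each header includes what it depends on" rule can be illustrated with a made-up abc.h. The header name comes from the example above; the struct and function are hypothetical. Since FILE is declared by the standard header <stdio.h>, that is the dependency this sketch pulls in.

```c
/* ---- abc.h (hypothetical header, shown inline for illustration) ---- */
#ifndef ABC_H
#define ABC_H

#include <stdio.h>      /* abc.h uses FILE, so it includes <stdio.h> itself */

struct abc_handle {
    FILE *fp;           /* would not compile if FILE were undeclared */
};

/* Returns 1 when the handle refers to an open stream, 0 otherwise. */
static inline int abc_is_open(const struct abc_handle *h)
{
    return h != NULL && h->fp != NULL;
}

#endif /* ABC_H */
```

A client file can then start with #include "abc.h" and nothing else, in any position relative to its other includes. The include guard keeps repeated inclusion harmless.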

      Regards: Joe Robbins

  • Brian Tiffin

    Brian Tiffin - 2014-08-09


    There is a bit of a catch-22 now.

    config/default.conf isn't read before install and this happens during the build of CBL_OC_DUMP in /extras (where cobc/cobc is the new build):

    cobc/cobc ...
    default.conf: No definition of 'use-sparse-indexed-keys'
    cobc: Error: Failed to load the initial config file

    but, if the new use-sparse-indexed-keys is added to /usr/local/share/gnu-cobol/config/ then all the other builds of GNU Cobol get

    cobc -m -Wall -std=mf -O extras/CBL_OC_DUMP.cob 
    mf.conf:142: Unknown tag 'use-sparse-indexed-keys'
    cobc: Error: Invalid option -std=mf

    I think the fileio rewrite is going to have to be more forgiving if the config option is missing. Otherwise it'll get scary.

    I think.


    • Simon Sobisch

      Simon Sobisch - 2014-08-09

      No, config.c shouldn't be forgiving. There was a problem with missing conf entries before that I have fixed.
      If the problem persists we have a problem in the Makefile/atlocal as the source configuration should be used, not the configuration that is installed.
      Please do svn update and try to reproduce the issue.


      • Brian Tiffin

        Brian Tiffin - 2014-08-17


        I'll run updates and checks.


