
#604 Line sequential problem

unclassified
open
None
5 - default
2021-11-10
2019-10-27
cdg
No

The Windows convention for "line sequential" files is to terminate each line with x"0d0a" (carriage-return+line-feed). There are several exceptions to this:

1) If the program specifies "WRITE BEFORE ADVANCING PAGE", the (previous) line terminator will be x"0d0c" (carriage-return+form-feed) or x"0c0d" (form-feed+carriage-return).

2) If the program specifies "WRITE AFTER 0 LINES", the (previous) line terminator will be x"0d" (carriage-return) without the x"0a" (line-feed). ("WRITE AFTER 0 LINES" is most frequently used for underscoring by overprinting the underscore-line.)

GNU Cobol appears to create the correct line-terminators corresponding to the write statements. However, if the file is written to disk (rather than sent to a printer) and then read by a GNU Cobol program, it ignores the x"0d" line terminator, and treats the overprint line as a continuation of the previous line.

Attached is a test program to illustrate this.

The program output follows:

WRITING:
Line 1 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
____________________________________________
Line 2 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

READING:

Line 1 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx_______________________________
Line 2 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
EOF

Hex Dump of Testfile:
--------------------
0D0A
4C696E65203120787878787878787878787878787878787878787878787878787878787878787878787878780D
5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F5F0D0A
4C696E65203220787878787878787878787878787878787878787878787878787878787878787878787878780D0A

Note that the first line is preceded by x"0d0a" due to "write after 1 line", but it is terminated by x"0d" (not x"0d0a") because the subsequent write is "after 0 lines". That line is terminated by x"0d0a" because the NEXT line is "write after 1 line". And THAT line is terminated by x"0d0a" because that terminator is apparently generated by the CLOSE statement.
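The byte layout described above can be modelled in a few lines. This is only a sketch in Python (not COBOL, and not the GnuCOBOL runtime): the function name `line_sequential_bytes` is hypothetical, the record texts are shortened stand-ins for the test data, and the handling of an advance greater than 1 (one CR/LF per line) is an assumption that the dump does not exercise.

```python
def line_sequential_bytes(writes):
    """Model the bytes produced by WRITE ... AFTER ADVANCING n LINES.

    writes: list of (lines_to_advance, text) pairs, in write order.
    The terminator is emitted BEFORE each record, as seen in the dump.
    """
    out = bytearray()
    for advance, text in writes:
        if advance == 0:
            out += b"\r"               # overprint: carriage return only
        else:
            out += b"\r\n" * advance   # assumption: one CR/LF per line advanced
        out += text.encode("ascii")
    out += b"\r\n"                     # final terminator, apparently added at CLOSE
    return bytes(out)


# Shortened stand-ins for the three records written by the test program.
sample = line_sequential_bytes([
    (1, "Line 1 xxx"),
    (0, "___"),
    (1, "Line 2 xxx"),
])
print(sample.hex().upper())
```

The printed hex string has the same shape as the dump: a leading 0D0A, a bare 0D between the text line and its underscore line, and 0D0A everywhere else.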

[I'm aware that the Linux convention is to terminate a line with x"0a" only, but shouldn't a Windows build employ the Windows convention? And, it appears to do so when writing the file, so why not when reading it?]

1 Attachments

Discussion

  • cdg

    cdg - 2019-10-27

    Correction: "WRITE BEFORE ADVANCING PAGE" (above) should read "WRITE AFTER ADVANCING PAGE"

     
  • Simon Sobisch

    Simon Sobisch - 2019-10-27

    If I understood you correctly, you'd expect the x"0d" to be read as the line separator instead of x"0a", removing the x"0a" completely, correct? Or should the line "always terminate when one of x"0a", x"0d" or x"0d0a" is found"?

    I'm not aware of any COBOL environment doing this, as it would prevent you to recognize in the program that you had an "overwrite" before.
    [In this case the first character of the "overwrite line" must be read in as x"0d"; is this the case? (You may check with CALL "CBL_OC_DUMP" USING fd-record.)]

     
  • cdg

    cdg - 2019-10-27

    The following are valid record-delimiters (line-separators) for DOS and Windows print files:

        CR/LF (x"0d0a")  - Normal case of a single-spaced line
                           (a multiply-spaced line would have several CR/LFs)
        CR (x"0d") alone - Specifies that the following line is to be overprinted
        CR/FF (x"0d0c")  - Form feed after record
    

    So, any of the three will terminate the line. As to x"0a" by itself, that is the Linux convention. Windoze programs don't use it.

    I'm not aware of any COBOL environment doing this, as it would prevent you to recognize in the program that you had an "overwrite" before.

    To my knowledge, EVERY Cobol environment does this, even GNU Cobol --- on output. Realia Cobol and Realia Cobol II do it on input as well. Similarly, most Windoze text editors (e.g. Notepad, Notepad++, PSPad, Beyond Compare, etc.) treat all three as valid line-separators, albeit they don't perform an overprint.

    But your point is valid, that the program reading the print file loses recognition of the overprint (or form feed). That hasn't been an issue for me --- yet, as most of my programs use the IBM (CTLCHR=ASA) format until they actually send data to the printer.

    Programming around it, in the one program that is affected, would mean doubling the defined record size for the input file, and scanning for the x"0d" and line following it. The problem is that GNU Cobol removes the x"0d" from the record! So I would have to read the file as a continuous stream, and do my own deblocking!
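The "read as a continuous stream and deblock" workaround mentioned above might look roughly like the following. It is a language-neutral sketch written in Python rather than COBOL; the name `deblock` and the shortened sample data are hypothetical, and only the three terminators listed earlier in this comment are recognised. Each record is returned together with the terminator that followed it, so the reading program can still tell an overprint (CR alone) from a normal line.

```python
import re

# CR/LF, CR/FF, or CR alone. The two-byte forms are listed first so that
# a CR followed by LF or FF is not mistaken for a bare CR.
TERMINATOR = re.compile(rb"\r\n|\r\f|\r")


def deblock(data):
    """Split raw file bytes into (record, terminator) pairs."""
    records = []
    pos = 0
    for match in TERMINATOR.finditer(data):
        records.append((data[pos:match.start()], match.group()))
        pos = match.end()
    if pos < len(data):
        # Trailing data without a terminator.
        records.append((data[pos:], b""))
    return records


# Shortened stand-in for the test file shown in the hex dump.
raw = b"\r\nLine 1 xxx\r___\r\nLine 2 xxx\r\n"
for record, terminator in deblock(raw):
    print(record, terminator)
```

Because the file starts with a terminator, the first pair holds an empty record; a caller would probably skip it or treat it as the initial line advance.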

    I don't see "CBL_OC_DUMP" in the programmer's guide. What does it do?

     

    Last edit: cdg 2019-10-27
    • Arnold Trembley

      Arnold Trembley - 2019-10-28

      CBL_OC_DUMP is a callable subprogram in your "extras" folder that is used to print a hexdump of memory.

       Directory of C:\GC31-BDB\extras
      
      09/01/2019  11:02 PM    <DIR>          .
      09/01/2019  11:02 PM    <DIR>          ..
      08/28/2019  12:39 PM             9,743 CBL_OC_DUMP.cob
      09/01/2019  10:26 PM            17,381 CBL_OC_DUMP.dll
      09/01/2019  10:24 PM            18,241 Makefile
      08/28/2019  12:39 PM             1,303 Makefile.am
      08/28/2019  12:45 PM            18,066 Makefile.in
      08/28/2019  12:39 PM               516 README
      

      The most current version was written by Brian Tiffin.

       
      • Simon Sobisch

        Simon Sobisch - 2019-10-28

        The extras directory should also be mentioned in the README.

         
  • Simon Sobisch

    Simon Sobisch - 2021-11-10
    • assigned_to: Ron Norman
     
  • Simon Sobisch

    Simon Sobisch - 2021-11-10

    Assigned to Ron as he recently made adjustments related to that.
    @rjnorman74 please check and give a short notice, so we may be able to close that old bug report soon.

     
  • Ron Norman

    Ron Norman - 2021-11-10

    Although I could make GnuCOBOL return the data stopping at the CR, I have tested this using Micro Focus, and Micro Focus works exactly the same as GnuCOBOL. The read back using MF is:
    READING:

    Line 1 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx_______


    Line 2 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    EOF

     
