I am writing a fixed line sequential file on Linux using EXTFH. Unfortunately, I am not getting LF characters at the end of each line.
I am setting the following values:
move fcd--line-sequential-org to FCD-ORGANIZATION
move fcd--recmode-fixed to FCD-RECORDING-MODE
move LS-FILEDEF-LRECL to FCD-MIN-REC-LENGTH, FCD-MAX-REC-LENGTH
The output file records are complete, except for the LF characters. Am I doing something wrong here? If not, I will attempt to debug and send a patch, but I figured I would ask before making the effort in case this is a FAQ.
FYI - this works fine with MF Cobol (Server Express 5.1).
Eric Raskin
eraskin at paslists dot com
Last edit: Eric Raskin 2018-12-14
I was just killing some time when I read your post. I have not worked with EXTFH very much but it is my understanding that the addition of a LF character to an all-text record implies variable length records. That would call into question your second line of code where it appears that you are setting fixed length record mode. By definition fixed length means that all the records in the file are the same length hence there is no need for the terminating LF character.
I suggest that you set the 'recmode' to the variable option. Your third line will create a file with all the records the same length (plus a LF) which may or may not be what you want.
Gregory
INPUT-OUTPUT SECTION.
FILE-CONTROL.
*
SELECT TEXTFILE-F ASSIGN TO DISC TEXTFILE-F-NAME
ORGANIZATION LINE SEQUENTIAL
ACCESS SEQUENTIAL
STATUS DASDSTAT.
*
SELECT TEXTFILE-V ASSIGN TO DISC TEXTFILE-V-NAME
ORGANIZATION LINE SEQUENTIAL
ACCESS SEQUENTIAL
STATUS DASDSTAT.
*
SELECT TEXTFILE-X ASSIGN TO DISC TEXTFILE-X-NAME
ORGANIZATION LINE SEQUENTIAL
ACCESS SEQUENTIAL
STATUS DASDSTAT.
...
DATA DIVISION.
FILE SECTION.
*
FD TEXTFILE-F LABEL RECORD STANDARD.
01 TEXTFILE-F-RECD PIC X(999).
FD TEXTFILE-V RECORD VARYING 1 TO 999 CHARACTERS
DEPENDING ON TEXTFILE-V-RLEN
LABEL RECORD STANDARD.
01 TEXTFILE-V-RECD PIC X(999).
*
FD TEXTFILE-X LABEL RECORD STANDARD.
01 TEXTRECD-X-0009 PIC X(009).
01 TEXTRECD-X-0099 PIC X(099).
01 TEXTRECD-X-0999 PIC X(999).
The definition for TEXTFILE-F will create a file of text records that are
all the same length (999) and will include the terminating LF (total 1000).
The definition for TEXTFILE-V will create a file with text records that
are the length specified in TEXTFILE-V-RLEN at the time the WRITE verb
executes. These records will also include the terminating LF.
The definition for TEXTFILE-X will create a file with text records that
are the length of the record name written, i.e.:
- 'WRITE TEXTFILE-X FROM TEXTRECD-X-0009' will write 9 characters plus the LF.
- 'WRITE TEXTFILE-X FROM TEXTRECD-X-0099' will write 99 characters plus the LF.
- 'WRITE TEXTFILE-X FROM TEXTRECD-X-0999' will write 999 characters plus the LF.
The fields you are loading in your example should be set by the compiler while doing its interpretation of your Cobol source code. For example the minimum and maximum record length values are gleaned from the 'RECORD VARYING ...' clause by the compiler.
Hope this helps.
Gregory
Please see the attachment. A simple program to output the numbers 1 through 10 in a fixed line sequential file. I set the record size to 10 characters, so I expect 7 trailing spaces followed by a linefeed for each record. Instead I get all 10 numbers on one line.
See the test.out file included in the zip.
I also included a small test.sh to compile and execute the test.
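For reference, here is a minimal sketch of the file contents the test is expected to produce, written in Python purely to show the byte layout. The helper names are hypothetical, and the three-digit zero-padded number format is an assumption (the attachment is not reproduced here); the only facts taken from the post are the record size of 10, the 7 trailing spaces, and the linefeed after each record.

```python
def fixed_lsq_record(text: str, length: int) -> bytes:
    """Pad (or truncate) text to `length` bytes, then append a single LF."""
    data = text.encode("ascii")[:length]
    return data.ljust(length, b" ") + b"\n"


def build_test_file(count: int = 10, length: int = 10) -> bytes:
    """Concatenate records for the numbers 1..count.

    Assumption: each number is written as three digits (e.g. "001"),
    which leaves 7 trailing spaces in a 10-byte record.
    """
    return b"".join(
        fixed_lsq_record(f"{n:03d}", length) for n in range(1, count + 1)
    )
```

With these assumptions every record occupies 11 bytes on disk (10 data bytes plus the LF), so the ten-record file would be 110 bytes long.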
I've reproduced the issue and was able to fix it (some tests are pending, your sample works as expected).
The background:
LSQ files had only "always remove trailing spaces" or "never" (depending on a runtime configuration flag); the rw-branch added "depending on a runtime configuration flag for the specific file", but none of those help with EXTFH.
As GnuCOBOL generates an internal FCD for LSQ files with record_min = 0, I've adjusted the removal of trailing spaces to use the FCD's minimal length. This leads to 7 trailing spaces in your sample.
The missing line break is rooted in the way GnuCOBOL handles WRITE BEFORE and WRITE AFTER. Is there any option to use these (with lines/pages) with FCD? If yes: how?
If no internal option is specified, I've currently used BEFORE 1 LINE internally; this likely needs a tweak before commit...
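The trailing-space handling described above can be sketched roughly as follows. This is an illustration in Python, not the actual libcob code; the function name and parameter are hypothetical, and the only behaviour taken from the post is "strip trailing spaces, but never below the FCD's minimal record length".

```python
def trim_trailing_spaces(record: bytes, min_length: int = 0) -> bytes:
    """Strip trailing spaces, keeping at least `min_length` bytes.

    With min_length = 0 (the value cobc generates for LSQ files, per the
    post) all trailing spaces are dropped. With min_length equal to the
    record size, the record is left untouched.
    """
    stripped = record.rstrip(b" ")
    keep = max(len(stripped), min_length)
    return record[:keep]
```

For a 10-byte record holding "001" and seven spaces, a minimal length of 10 keeps all seven spaces, while a minimal length of 0 leaves only the three data bytes.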
Does the COBOL standard make a distinction between 'ASSIGN TO DISK/ORGANIZATION LINE SEQUENTIAL' (DLSQ) and 'ASSIGN TO
PRINTER/ORGANIZATION LINE SEQUENTIAL' (PLSQ)? If not, it should because
the control characters that are placed in each record for each type of
file differ in placement and which characters are allowed.
In a direct access storage device (ie disk) DLSQ file the only non-text
characters allowed in the data should be the 'Cr' and/or 'Lf' which are
placed at the end of the data record only. The end of record sentinel is
'CrLf' for Windoze and 'Lf' for Linux. Also the 'WRITE/BEFORE' and
'WRITE/AFTER' variants should probably throw a warning message on DLSQ
definitions at compile time. They may work but are illogical.
In a PLSQ file some additional non-text characters are allowed. These are
'Ff' (0x0C) and Tab (0x09). Also 'WRITE/BEFORE' and 'WRITE/AFTER' verbs
are allowed and must be considered when deciding where to place the 'Cr'
and/or the 'Lf' characters. Some considerations are:
a) The first character written to a print file should be a 'Cr' character.
It sets the device buffer index to column 1. I presume that at 'OPEN'
time is when this should occur. This should happen regardless of 'OPEN
OUTPUT' or 'OPEN EXTEND'.
b) The last character placed into the device buffer as the result of each
'WRITE' verb should be the 'Cr' character only. The 'Cr' sets the device
buffer index to column 1. Do NOT add 'CrLf' (unless Windoze) as that
forces a line feed where one may not be desired.
c) 'WRITE PRINTREC AFTER n LINES':
- 'n' can be zero
- the record written to storage would look like this assuming n=3:
- - Lf Lf Lf <the text to be printed> Cr
d) 'WRITE PRINTREC BEFORE n LINES':
- 'n' can be zero
- the record written to storage would look like this (n=3):
- - <the text to be printed> Lf Lf Lf Cr
e) 'WRITE PRINTREC AFTER PAGE':
- the record written to storage would look like this:
- - Ff <the text to be printed> Cr
f) 'WRITE PRINTREC BEFORE PAGE':
- the record written to storage would look like this:
- - <the text to be printed> Ff Cr
g) 'CLOSE PRINTFILE':
- The PLSQ storage should be written with 'CrLf'.
Anyway this is how I believe things should work in my world.
G
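The record layouts proposed in items a) through g) above can be written down as a small sketch. It is in Python only to make the byte sequences explicit; it encodes the proposal as stated in this post, not the behaviour of GnuCOBOL, MF, or any other runtime, and all function names are hypothetical.

```python
from typing import Iterable

CR, LF, FF = b"\r", b"\n", b"\x0c"


def print_record(text: bytes, advancing: str, lines: int = 0,
                 page: bool = False) -> bytes:
    """Bytes for one WRITE under the proposed PLSQ layout (items b to f).

    advancing: "after" or "before"; lines may be zero; page selects Ff
    instead of line feeds. Every record ends with a bare Cr.
    """
    control = FF if page else LF * lines
    if advancing == "after":
        return control + text + CR
    if advancing == "before":
        return text + control + CR
    raise ValueError("advancing must be 'after' or 'before'")


def print_file(records: Iterable[bytes]) -> bytes:
    """Whole file: leading Cr at OPEN (item a), CrLf at CLOSE (item g)."""
    return CR + b"".join(records) + CR + LF
```

For example, `print_record(b"TOTAL", "after", lines=3)` gives three line feeds, the text, and a closing carriage return, matching item c).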
LINE SEQUENTIAL was never defined in the COBOL standard. But from the discussion of the last OWG-3 meeting I'm sure 202x will finally get it in, possibly with nice extensions that no COBOL runtime has currently (GnuCOBOL will get them very early ;-) including the option to adjust the padding/rtrim character (for example to EBCDIC space) and to adjust the record separator (for example to fixed CrLf).
Most (all?) compilers allow writing "anything" to these files; only the ones that allow (optional?) escape characters (for example for the line-feed, for hex-null, ...) can also read the data back in (otherwise you may get a "bad" read, for example of a COMP field).
Note: many compilers distinguish line-sequential files from print-files. In this case the second type (often specified by ASSIGN TO PRINTER[-...]) often implies some stuff (at least similar to what you have described).
Real print files will often have escape sequences in (formatting, sizing, spacing, change encoding, ...within the printer).
I did some more testing on MF Cobol with my little test program. Simon is correct in that trailing spaces are removed based on a runtime configuration and not an FCD setting. My test program did indeed output 3 character lines and not the 10 I expected.
STRIPSPACE
When performing WRITE or REWRITE operations on line sequential or line advancing files, the STRIPSPACE option determines whether to remove trailing space characters.
STRIPSPACE={ON|OFF}
Default: ON
However, it did insert linefeed characters as I reported originally. And I know that's what you are working on now.
I wish I knew more about the COBOL standard to answer your question, but I just have empirical evidence based on how I've used the language over the years. I have never actually seen a leading CR as the first character in a PLSQ file. I dumped a print file I have around. I don't see a leading CR on the first line. I do see a trailing CR on my output lines - even on Linux (and not just "Windoze"). I also see a trailing CR directly before an FF, which implies to me that Gregory is correct about every line ending with a CR on every platform.
So, other than the leading CR in the first record, Gregory appears to be correct IMHO.
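A dump like the one described above can be summarised with a few byte counts. The sketch below is in Python, and the function name is hypothetical. It only reports where CR, LF and FF occur in data you pass in, so it makes no claim about what any compiler writes.

```python
def describe_print_data(data: bytes) -> dict:
    """Count the control characters discussed in this thread."""
    return {
        "leading_cr": data.startswith(b"\r"),    # CR as the very first byte?
        "cr": data.count(b"\r"),
        "lf": data.count(b"\n"),
        "ff": data.count(b"\x0c"),
        "cr_before_ff": data.count(b"\r\x0c"),   # CR directly before an FF
    }
```

To check a real file, read it in binary mode (`open(path, "rb").read()`) and pass the result to the function.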
Last edit: Eric Raskin 2018-12-19
Eric, just a quick question. Does EXTFH affect display? We write files (piped) to the next program in the job stream scripts (linux). Depending on if we want a line feed (LF) or not, we use or not use the ' display no advancing' . We get a LF with just ' display' , and no LF using the 'display no advancing'. Just wondering if EXTFH changes this... ?
And as far as I can see, there is no CR on either.
Last edit: Mickey White 2018-12-19
As far as I know, it does not. DISPLAY is sent to a "screen" (console) and not to a "file". You have to specifically enable using EXTFH (external file handler). It is really for special cases, like accessing COBOL file handling from C code or writing your own file handler.
The regular COBOL file handling code should be unaffected by whatever changes we make here, but that is really for Simon to confirm (as he is making the changes). And none of it should affect DISPLAY handling.
In our case, it gets used due to some really old software we have transferred from a mainframe. It uses 80 character card images for parameters. One set of parameters defines file formats and record lengths. The developers chose to use EXTFH so they could manipulate the FCD directly to create the files as required.
Last edit: Eric Raskin 2018-12-19
I have your test program.
That is certainly the hard way to create a print file...
But I will investigate why it is not inserting the LF, though this may not happen until January.
Thanks. Not really a print file. We deal with data files full of names/addresses. Many of them arrive as line sequential files so we need to read/write that format. I am just looking to migrate off our proprietary compiler to OpenCobol, so it is not a rush.
I've done some testing and more investigation and found that I had completely missed copying the various flags related to LINE SEQUENTIAL files between the FCD3 structure and the internal GnuCOBOL cob_file structure.
(I have only ever used EXTFH with Microfocus for INDEXED files.)
(I can add that but may not get it done until January.)
None of the runtime options for file formats have been merged into the 'trunk' yet.
This was added to the 'reportwriter' branch around 18 months ago and gives you a lot of control over how files are formatted.
When I make a fix for EXTFH related to LINE SEQUENTIAL files, it will be in the 'reportwriter' branch.
Thanks Ron for fixing the new part of [bugs:#561] with [r2804]. I've just merged this change to trunk with [r2805].
There's one thing I'm unsure about: MF seems to completely ignore the minimal record length on (extfh) WRITE (and always pad up to the maximum record length depending on a flag).
I'd like to change that to allow the programmer to specify the minimal length.
Should this always be done or do we need a runtime configuration for this?
(Note: a cobc-created line-sequential file definition has a record length of 0, therefore the behaviour would not be changed [until we allow the programmer to specify the minimal length].)
The testsuite has the following EXTFH entries now: ISAM (with callback handler), sequential, and line-sequential. Does someone have a relative sample to add to the testsuite (or can create + test it on MF and later recheck that GnuCOBOL handles this correctly)?
Just seen in the sample program: FCD-CONFIG-FLAGS2 has fcd--file-has-write-after and fcd--file-has-write-before, what is this originally used for with MF?
In general there are a lot of options we don't set/read/define at all in the FCD structure, I'm quite sure we could handle some (and some more in rw-branch, for example the "lock" parts or a "per file" access user stat [default would be COB_FILE_MODE]).
@Ron: Can you have a look at this please (maybe starting with lock parts, then compare the defines in the test sample with what is currently defined in common.h and furthermore what parts are used)?
Hi:
Yes - it is strange, but line sequential fixed is what we really intended. Trailing spaces and all, but with a linefeed at the end of each record.
Eric
Last edit: Simon Sobisch 2018-12-17
Thanks. I am sure this works perfectly. But I am using the EXTFH interface. Unfortunately, it does not work there. That's the whole problem. ;-)
I need some sample code that shows exactly what you are doing.
Please note that at present GnuCOBOL does not support ADDRESS OF FH--FCD OF filename
(In reportwriter there are runtime options that can be used to control how the files are created.)
Thanks for working on this. Any idea where it stands right now? Is there anything I can test yet?
I should be able to come back to this on the weekend, currently wait for a response to
:-)
Wow, sorry to open such a can of worms! :-)
Now fixed in reportwriter:
/home/rjn/gnucobol/branches/reportwriter:64 bit:80 >svn commit -m'Fix EXTFH for LINE SEQUENTIAL'
Password:
Sending libcob/ChangeLog
Sending libcob/fileio.c
Sending tests/testsuite.src/run_misc.at
Transmitting file data ...
Committed revision 2804.
Related Bugs: #561
I just downloaded trunk r2806 and I am pleased to report my issue was repaired. Line Sequential files output with a linefeed on Linux as we expected.
Thank you all so much and Happy Holidays!