On 13/10/2021 13:47, David Wall wrote:
Done.
All, see bug 774.
Vince
Last edit: Vincent (Bryan) Coen 2021-10-13
It seems like this would also be related to Feature Request #405:
https://sourceforge.net/p/gnucobol/feature-requests/405/
Yep - looks likely - another 'untested' change to fileio.c perhaps.
Geez - the MF people sure have influence here, don't they??
By the way - @jamesbwhite, what's the ? for - you didn't understand the problem, or the reason for deleting the discussion (since it's now in 'bugs'), or what?
I would have expected a status code of X'9F' (malformed line sequential file), 18, or the catch-all 30.
If it is line sequential you could make the LRECL 32,765. It doesn't matter.
Of course, if it is not a line sequential file then LRECL does matter.
An LSEQ record is delimited by the presence of x"0D0A" (or just x"0A" on Linux).
I would honestly regard this as programmer error. But if no status code is raised, then it's a bug.
Last edit: Ralph Linkletter 2021-10-14
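To make the delimiter rule above concrete, here is a minimal C sketch of a truncating line sequential read - an illustration only, not GnuCOBOL's actual fileio.c logic, and it assumes the stream was opened in binary mode ("rb") so the CR bytes are visible. The record ends at x"0A"/x"0D0A"; anything past the FD size (max) is read and discarded:

#include <stdio.h>

/* Truncating LINE SEQUENTIAL read (illustrative): copy bytes into buf
 * until the terminator; x"0D0A" ends the record, a lone CR is kept as
 * data; bytes beyond max are silently dropped.  Returns the record
 * length, or -1 at end of file. */
static int read_ls_truncate (FILE *fp, unsigned char *buf, int max)
{
    int n, i = 0;

    while ((n = getc (fp)) != EOF && n != '\n') {
        if (n == '\r') {
            int c = getc (fp);
            if (c == '\n')
                break;          /* x"0D0A" terminates the record */
            if (c != EOF)
                ungetc (c, fp); /* lone CR: keep as data */
        }
        if (i < max)
            buf[i++] = (unsigned char)n;
    }
    return (n == EOF && i == 0) ? -1 : i;
}

This is why LRECL hardly matters for a well-formed LSEQ file: the terminator, not the FD size, decides where a record ends.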
Ralph - I don't use status codes on this particular file & thus would expect the program to just crash - but it doesn't.
There's a minimal program and sample input file in bug 477 - you tell me where the problem lies - me or the compiler.
Hi David.
Just for grins I ran your program with Fujitsu - I had to remove the free-format source and get rid of the repository stuff, then added a status code. The execution was the same as the execution you documented.
No status code other than '00' and '10' - not what I expected.
I then ran the program under MF - same results.
I could not run it under z/OS - no record format of line sequential, no free form, no repository.
Windows/Linux execution of programs using line sequential format is apparently not constrained by overrunning the FD allocation with a record size greater than what the FD allocates.
IMO very weird - but I would guess that this PC/Linux behavior has been used for decades.
I would have thought a status code of X'9F' (malformed line sequential file) would be signaled - given that the CR/LF was not found within the bounds of the FD allocation.
MF and Fujitsu responded in the same manner as GnuCOBOL.
Go figure - what do I know :-)
Ralph
Ralph, you added a status code - did you try without the status code, & did it crash?
With or without a status code - it did not matter.
The program runs to completion.
Nary a crash.
Yes, the root cause is the program that wrote the data file with different record lengths. Unless that is in the specs, I do also think the program should abend for different record lengths that are not defined as variable.
Note: the 4.x behavior is by design, matching COBOL 202x - that is a "partial read" with a warning status (0+something, so nonfatal, no abend)
Related commits: 3.2: [r4338] (option to allow LS_SPLIT) and 4.x: [r4405] (now a default).
So if David were to add a status code to the SELECT statement and run it with 4.x, he should see a 0.x status code - correct?
Yes, but I've not checked if that status depends on "something" already
Surely it would be better to have both versions 3.2 & 4.x do the same thing - i.e. either both default to true (partial record read on) or false (records truncated) - rather than having a mishmash.
4.x is a "no need for compat" release and will also have documentation about the different defaults - there are several.
No comment about https://sourceforge.net/p/gnucobol/discussion/help/thread/56bb9532f3/?limit=25#0f71 ??
You don't work for Microsoft do you - that seems to be their type of thinking.
Then again - having made NO comment about what appears to be a code error - should I be surprised? I realise you possibly didn't actually do the code change, but it doesn't seem to have been particularly well tested - certainly not under Windows.
I guess that sort of comment gets me put in the naughty corner & nothing gets changed.
Abends on the mainframe.
You have LINE SEQUENTIAL there?
I do think that if you changed to SEQUENTIAL, GC would also abend (possibly in a nicer/different way with --debug).
no line seq... you are correct
There still seems to be something wrong with the partial record read of 4.x - at least in my view.
I have attached the testsuite source data, prog.cob, testfile and two standard out files.
One is from the actual MinGW test number 782 - LINE SEQUENTIAL record truncation - which is the file named stdoutm; the second is from my version of that same test, which uses the same prog.cob & testfile - the stdout file from my Windows version is called stdoutw.
The MinGW output has the last record shown as 00 (e ) whereas the Windows version has the content 00 (cde ), which tends to indicate to me that the record truncation may well be truncating at the correct place - but the next (partial) record is starting 2 characters back from where it should.
Just compile the program & then run it - you'll see the output I'm talking about - under Windows of course.
Please look at my latest response to BUG 774 - this truncation error is occurring with ver 3.2 after changing the .cfg & it's not just a documentation error.
There's distinctly something wrong with this section of code in fileio.c - maybe it's 'assuming' there'll always be a CRLF involved. I had a play & got it to work correctly by NOT doing the -k in the fseek but doing a +1 instead - but that only worked for one specific case. It needs a rewrite, in both 3.2 & 4.x:
if (likely(i < f->record_max)) {
    *dataptr++ = (unsigned char)n;
    i++;
    if (i == f->record_max
     && (cobsetptr->cob_ls_split)) {
        /* If record is too long, then simulate end
         * so balance becomes the next record read */
        int k = 1;
        int z = 1;                      /* << I added this */
        n = getc ((FILE *)f->file);
        if (n == '\r') {
            n = getc ((FILE *)f->file);
            k++;
        }
        /* and changed this from -k to +z: */
        if (n != '\n') {
            fseek ((FILE *)f->file, +z, SEEK_CUR);
        }
        break;
    }
}
The somewhat strange thing is that 'again' this specific test does NOT fail during the testsuite tests - possibly because the test does NOT use the ls_split option either in 3.2 or 4.x.
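FWIW, a rewrite would probably avoid the relative seek altogether. The C standard only defines fseek with a nonzero SEEK_CUR offset for binary streams; on a text stream (which is what CRLF translation on Windows implies) the only portable rewind is back to a position previously returned by ftell, via SEEK_SET. A hedged sketch of the split logic on that basis - an illustration of the idea, not a patch against fileio.c (the "testfile" name and the record size of 4 are just borrowed from the test above):

#include <stdio.h>

/* LS_SPLIT-style read (illustrative): at most max bytes per record;
 * if the buffer fills with no terminator, reposition the stream so
 * the balance is returned by the next call.  The rewind uses an
 * ftell value with SEEK_SET, which stays defined even for text
 * streams, unlike fseek (fp, -k, SEEK_CUR).  Returns the record
 * length, or -1 at end of file. */
static int read_ls_split (FILE *fp, unsigned char *buf, int max)
{
    int n = EOF, i = 0;

    while (i < max && (n = getc (fp)) != EOF && n != '\n') {
        if (n == '\r') {
            int c = getc (fp);
            if (c == '\n')
                break;          /* x"0D0A" ends the record */
            if (c != EOF)
                ungetc (c, fp); /* lone CR is data */
        }
        buf[i++] = (unsigned char)n;
    }
    if (i == max) {
        /* Buffer full: consume an immediately following terminator,
         * otherwise rewind so the balance becomes the next record. */
        long pos = ftell (fp);
        n = getc (fp);
        if (n == '\r')
            n = getc (fp);
        if (n != '\n')
            fseek (fp, pos, SEEK_SET);
    }
    return (n == EOF && i == 0) ? -1 : i;
}

int main (void)
{
    unsigned char rec[4];   /* record size of 4 assumed for the demo */
    int len;
    FILE *fp = fopen ("testfile", "rb");

    if (!fp)
        return 1;
    while ((len = read_ls_split (fp, rec, (int)sizeof rec)) >= 0)
        printf ("(%.*s)\n", len, (char *)rec);
    fclose (fp);
    return 0;
}

With this shape the second read of an over-long record starts exactly where the first stopped, whether or not a CR happens to sit at the boundary - which is precisely where the -k/+1 variants above go wrong in one case or another.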
I don't know who you think works for MSFT?
It seems to me that the issue here is a combination of a programmer snafu and a compiler issue.
Does it really matter?
It seems that you are asking the compiler to perpetuate inconsistent logic for what is, in my mind, illogical from the application's perspective.
Expecting a defect to be resolved by the compiler is a bit of a stretch from a compiler development perspective. Agreed?
My perspective is that it is an ill-formed line sequential file - a status code should be returned.
If the status code is not used (why ignore it?) then the application will execute illogically.
If the intent of the application is to process records that have no consistent format, I would expect use of CBL routines of this kind:
open-in-file section.
    call "CBL_OPEN_FILE" using infilename
                               k-1
                               k-0
                               k-0
                               infile-handle
    if return-code not = 0
        display "Open failed: " infilename
        stop run
    end-if

*> Find length of input file.
    call "CBL_READ_FILE" using infile-handle
                               infile-offset
                               in-buff-end
                               k-128
                               in-buff
    move infile-offset to infile-len
    move 0 to infile-offset
    move in-buff-len to in-buff-ptr
    add 1 to in-buff-ptr
    move fals to last-block-flag
    move fals to eof-flag

*> Loop this until you encounter a line sequential termination
*> X'0D0A' (Windows):
*> read data
    call "CBL_READ_FILE" using infile-handle
                               infile-offset
                               in-buff-end
                               k-128
                               in-buff
Or simply increase the FD allocation for the record size.
Seems a wee bit like tilting at windmills.
Ralph
Last edit: Ralph Linkletter 2021-10-16
Ralph, re your "It seems to me that the issue here is a combination of a programmer snafu and a compiler issue":
OK - from what I understand, this change - to allow what's left over (on a record longer than the FD describes) to be delivered to the following record - is in the 202x STANDARD.
If it's in the STANDARD - then I 'assume' someone wants to use it.
It therefore doesn't actually matter that the original record that's too long was produced by some crappy programmer - all that matters is that the compiler recognise the problem and deal with it 'correctly'.
If the .cfg has an option to either truncate the record (which was the only option up till now) or dish out the following data in some haphazard manner (which is what happens now), then where do you think the problem might lie?
To be honest - I couldn't give a 'rats' about where the problem is - I was dealing with a datafile I produced from the testsuite - which happened to have no 0D or 0A's in it - hence it was a long record.
If version 3.2 dealt with it correctly but 4.x didn't then I went looking. And found a fault.
I also don't really care if the fault never gets fixed - there are a number of faults I've found and they rarely get sorted out - so I end up either bypassing them or fixing them in my own builds.
I'll set the .cfg to LS_SPLIT = FALSE so that 3.2 doesn't care & won't split the file, and 4.x (which now defaults to splitting) will care & won't split either.
3.2 is 'supposed' to be the next best thing released - but if it's faulty then someone should give a damn.
OK rant over - still waiting for someone to even recognise there's a problem.
Given that 75+% of desktop computers run Windows, I'd have thought someone would actually care.
Have a nice day :)
David, if this is the case: "I was dealing with a data file I produced from the testsuite - which happened to have no 0D or 0A's in it - hence it was a long record" ...
You need byte stream I-O. Byte stream I-O is a useful primitive I-O method.
You are in control of what is read, for N bytes, no record termination criteria.
I use it all the time.
https://lists.gnu.org/archive/html/gnucobol-users/2007-06/msg00025.html
For instance, with a Micro Focus file with a 128-byte header before the actual data, I read the first 128 bytes, locate the record length in the header, then merrily read the byte stream for the record length stored in the header. I can then write fixed-length GnuCOBOL sequential records. I also use it for Micro Focus conversion of VRECGEN and VRECGEN2 file formats to GnuCOBOL formats.
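A minimal C rendering of that flow - note the header layout here (a 16-bit big-endian record length at offset 54) is invented purely for illustration and is NOT the real Micro Focus header format, and "mf.dat" is a placeholder name:

#include <stdio.h>

/* Byte-stream I-O in the spirit of CBL_OPEN_FILE/CBL_READ_FILE above:
 * read a 128-byte header, pull the (fixed) record length out of it,
 * then read exactly that many bytes per record - no delimiter
 * scanning at all, so x"0D0A" inside the data is just data. */
int main (void)
{
    unsigned char header[128], rec[65536];
    size_t reclen;
    FILE *fp = fopen ("mf.dat", "rb");          /* placeholder file name */

    if (!fp || fread (header, 1, sizeof header, fp) != sizeof header)
        return 1;
    /* Illustrative only: real header layouts differ. */
    reclen = ((size_t)header[54] << 8) | header[55];
    if (reclen == 0 || reclen > sizeof rec)
        return 1;
    while (fread (rec, 1, reclen, fp) == reclen) {
        /* e.g. re-emit as fixed-length records */
        fwrite (rec, 1, reclen, stdout);
        putchar ('\n');
    }
    fclose (fp);
    return 0;
}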
Ralph