Recently had the need to compare ISAM files and found immense difficulty in comparing the files created by VBI211 with those of its predecessor, 201.
201 created .dat and .idx files, with the .dat file being the size of the actual data recorded.
211 creates .dat files (on my 64-bit PC) with a minimum size of 4096 bytes, irrespective of the data, padded out with NULs (hex 00), which is a distinct PITA.
@sf-mensch - when you get round to making changes to VBI, can you please pad the blocks out with spaces and not NULs?
The internal structures of an ISAM handler (or any other) are NOT
guaranteed to stay the same between versions, if they are guaranteed at
all, so you should have a way to back the file up to a Seq or LS file,
if possible, before moving to the new version, and then use the new
version to copy it back.
Yes, this means you have to run the backup before re-compiling, and I
have not found a way around that.
Almost all my s/w, if not all, does this via a parameter to a program,
if for no other reason than that I may have to move platforms.
This could also apply when changing the version of the Cobol compiler,
and I do not EVER "ASSUME".
If in doubt, cover it (the possible issue) in code :)
This is 60+ years programming talking :(
Vince
Vince - whilst I appreciate the above - it's actually NOT what I'm kinda complaining about.
My only reason for using VBI over BDB is BECAUSE the VBI .DAT files are so easily read and displayed by mere text editors (because their index is a separate file), compared to the mishmash of BDB, where the index is mangled up with the data.
However - VBI 201 seemed to write only the actual data to the .DAT file (NO extra padding), whereas VBI211 now writes the .DAT file out in chunks of either 1024 bytes (32-bit) or 4096 bytes (64-bit), depending on whether your OS is 32- or 64-bit (Win10 in my case).
AND - instead of writing blanks (hex 20) to pad the file - it's left as NULs (hex 00), which makes a right mess of trying to display the .DAT file on the screen.
I was just asking Simon (when he mods the VBI) to pad with spaces instead of NULs - for us dummies.
I realise not everyone needs to look at the .DAT files and compare the data with other files - but I do, and it's not easy having all those NULs to wade through. :)
And - I'm only up to 54 years of programming - so WAY behind you!!
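Until the padding changes, a small script can take the NULs out of the picture when viewing or comparing .DAT files. This is only a rough sketch in Python: it assumes the sole difference between a 201 file and a 211 file is trailing hex 00 padding, which this thread suggests but does not confirm, and the function names are made up for illustration. Records with binary fields that legitimately contain NULs would need more care.

```python
def strip_nul_padding(data: bytes) -> bytes:
    """Drop trailing NUL (hex 00) bytes, i.e. the suspected block padding."""
    return data.rstrip(b"\x00")


def printable_view(data: bytes) -> str:
    """Swap every NUL for a space so the data displays cleanly on screen."""
    # latin-1 maps each byte to exactly one character, so nothing is lost.
    return data.replace(b"\x00", b" ").decode("latin-1")


def same_payload(a: bytes, b: bytes) -> bool:
    """Compare two .dat images, ignoring trailing NUL padding on either one."""
    return strip_nul_padding(a) == strip_nul_padding(b)


if __name__ == "__main__":
    # Made-up sample data: the same two records, unpadded and padded to 4096.
    old_style = b"000001FIRST RECORD  000002SECOND RECORD "
    new_style = old_style + b"\x00" * (4096 - len(old_style))
    print(same_payload(old_style, new_style))
    print(repr(printable_view(new_style[:48])))
```

For real files, read both with `open(path, "rb").read()` and pass the bytes to `same_payload`.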
I guess you've tried the new version from Ron, which creates C-ISAM compatible files by default? It can read and write both that format (which is used by MF and is therefore useful as a "more portable format") and the "old" VBISAM format.
As long as you don't OPEN OUTPUT the file, it will stay in the same format as before. V-ISAM will have the option to configure the "old" format as the default; GC4 already has the option to explicitly specify that the files should be in VBISAM format.
My basic point is that a change of version of an ISAM handler can change
the format of the data stored, or where it is stored.
In this case it is true - the programmer should have allowed for that,
but there again, if non-professional programmers are doing the work . . .
Vince
On 06/01/2022 15:59, David Wall wrote:
And - I'm only up to 54 years of programming - so WAY behind you.!!
Closing up though :)
But then again David and Vince, IBM has not "changed" VSAM since the release of OSVS/2 in 1974. VSAM replaced ISAM as an access technique. ISAM is not a supported COBOL access technique in IBM COBOL (unless the IBM is COBOL-D or COBOL-F - perhaps COBOL OSVS).
Different worlds I guess
Ralph
What on earth has IBM got to do with GnuCobol and the supporting ISAM
structures?
ISAM = Indexed Sequential Access Method - this term is not original to
IBM but goes back to the 60's and was in use by, hmm, let's try and
remember - ICT, ICL, English Electric, Burroughs, Honeywell, CDC, maybe
Cray (can't remember); Leo became EE, then ICT, et al. There are a lot
of others, but at my age the grey cells are not working as well.
ISAM is a methodology, and it is totally immaterial who coined it first,
as every computer under the sun using a mass storage process that is
deemed fast uses it one way or another - heck, even my mobile phone.
The method used by a piece of software acting as a device handler one
way or another, and that includes BDB, VBisam etc., has changed over its
development life, even more so with many different programmers involved -
with or without adhering to whatever standards were in use when it was
first written, if any were!
Now one would think that newer versions of this type of s/w would be
backward compatible, but I am afraid the standards we are used to have
dropped, assuming they were there in the beginning with some of them!
I am saying that having looked at some of this code in C, C++ [just
about], Pascal, Basic, Cobol, Assembler, Macro Assembler, S3, RPG n,
Algol 68R, to name but a few, in my short life :) This was both as a
Programming Manager / Director and just as a home programmer looking at
old code, and no, I have no intention of doing the same with my old
stuff, more than I have done with my ACAS system, some of which code
goes back to around 1967.
Vince
ISAM = VSAM in the IBM world.
IBM deprecated ISAM in 1972.
"The internal structures between version of a ISAM handler (or others) is
NOT guaranteed if at all, so you should have a way of back up the file
to a Seq or LS if possible before using the new version to copy it back.
Yes this means you have to run it before re-compiling and that I have
not found a way around."
The above kind of comment is not applicable to professional zOS application COBOL programmers. Your comments regarding ISAM are "out of this world" from a zOS professional COBOL programmer perspective.
Pity the zOS professional COBOL programmer that attempts to fathom a discussion regarding file access services - VBI vs BDB - huh?
As I previously stated, "different worlds".
Nothing more than a different experience basis.
Ralph
Let me understand.
If a record being written is 100 bytes in length - it will physically occupy 4096 bytes ?
Huh?
This new paradigm is compatible with ?
Perhaps this is referencing the control interval size - not the record length of the record being written ?
Ralph
Last edit: Ralph Linkletter 2022-01-06
Here are the files in question.
File named with 201 - size 352 bytes - 2 records, key 6 bytes (1-6).
File named with 211 - size 4096 bytes - 2 records, key 6 bytes (1-6).
The .idx file is identical in both cases - 12288 bytes - 3x4096 ???
I would say it's wasted space - but then I'm not a programmer, am I Vince ??
Well, if the record is 352 bytes and BDB adds about 37%, then say the record is around 482 bytes... 482 bytes is quite a bit less than 4096 bytes.
We have a BDB file of 142,671,636 records at 603 bytes each. The raw sequential file is then 86,030,996,508 bytes (81G); when loaded to a BDB index file, the size is 117,921,181,696 bytes (110G).
If we loaded the same records at 4096 bytes each, the BDB index file would be around 800,000,000,000 bytes, or about 781G.
Unless I am missing something, I would think an indexed file should not add over 3000 bytes per record.
Last edit: Mickey White 2022-01-07
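For anyone wanting to check those figures, the arithmetic can be reproduced in a few lines of Python. This is just a sketch that takes the numbers as posted and the 4096-bytes-per-record premise as given. It does not say anything about how BDB or VBISAM actually lay out a file, and the variable names are invented for the example.

```python
# Figures quoted in the post above.
RECORDS = 142_671_636
RECORD_LEN = 603                  # bytes per record in the sequential file
BDB_SIZE = 117_921_181_696        # bytes after loading into BDB

# Raw sequential size: record count times record length.
raw = RECORDS * RECORD_LEN

# Observed BDB overhead factor (roughly the "about 37%" quoted).
overhead = BDB_SIZE / raw

# Hypothetical case: every record occupying 4096 bytes, same overhead factor.
projected = RECORDS * 4096 * overhead

print(f"raw sequential size : {raw:,} bytes")
print(f"BDB overhead factor : {overhead:.4f}")
print(f"projected at 4096   : {projected:,.0f} bytes")
```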
Maybe you could go through this discussion - because from memory I couldn't get V-ISAM to work, so I reverted back to 211:
https://sourceforge.net/p/gnucobol/discussion/cobol/thread/584b210bde/?limit=25#23fc
Ron doesn't do any testing in Windows - so I guess it's NOT his problem - IS IT ??.
ALSO - This discussion way back in Dec last year:
https://sourceforge.net/p/gnucobol/bugs/791/#4b16/9b77
I agree, if records are each bloated by an extra 4000 bytes, then that is a lot of wasted disk space.
It's NOT per record - it's just the default block size. Depending on how many records you have, the size taken is a multiple of 4096 bytes.
With 201, if you have 2 records of 80 bytes then the .dat file is 160 bytes.
If you have 52 records of 80 bytes then the file is 4160 bytes.
With 211, if you have 2 records of 80 bytes then the .dat file is 4096 bytes.
If you have 52 records of 80 bytes then the file is 8192 bytes.
I try not to use BDB because the index is scrambled in with the data.
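The rounding described above can be written out as a short Python sketch. It assumes the .dat size is simply the raw data rounded up to a whole number of blocks, with at least one block (the "minimum size of 4096" mentioned earlier in the thread), and it ignores any per-record overhead. The names `raw_size` and `padded_size` are made up here and are not VBISAM functions.

```python
BLOCK_SIZE = 4096  # 64-bit figure quoted in the thread; 1024 was quoted for 32-bit


def raw_size(records: int, record_len: int) -> int:
    """Size when only the actual data is written (the 201 behaviour described)."""
    return records * record_len


def padded_size(records: int, record_len: int, block: int = BLOCK_SIZE) -> int:
    """Size when the data is rounded up to whole blocks (the 211 behaviour described)."""
    raw = raw_size(records, record_len)
    blocks = max(1, -(-raw // block))  # ceiling division; at least one block assumed
    return blocks * block


if __name__ == "__main__":
    for count in (2, 52):
        print(count, "records:", raw_size(count, 80), "vs", padded_size(count, 80))
```

With 80-byte records this reproduces the four sizes quoted above.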