Hello,
A general issue in modernizing COBOL environments is to transform the application's set of indexed files into a relational database.
It is not a trivial issue. There are a lot of proprietary solutions for it.
Do you know of projects or people that have worked on this problem in an Open Source philosophy?
With GnuCOBOL we have the semantics of the file descriptions and accesses analyzed, which is a milestone. Perhaps someone has thought of using them and applying transformations to them?
Any information, questions, or advice will be welcome.
Gérard Calliet
GnuCOBOL 4.0-early-dev provides an ODBC/OCI handler - this is a built-in feature - which allows you to "use COBOL indexed access" and actually access the database (it even allows for a one-time conversion of the files or a step-by-step conversion, as you can include multiple file handlers there and choose the one used "per file"). Combining that with PostgreSQL looks like a good free software stack to me.
To get the most performance out of SQL you'd likely convert at least part of the COBOL sources to use EXEC SQL, and there are some free implementations for that, too (for example esqlOC, found in the contributions, which also uses ODBC/OCI).
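To make that concrete, here is a minimal sketch of ordinary COBOL indexed I/O (file, key and field names are invented for the example). The point of the GnuCOBOL 4 handler is that source like this stays unchanged; whether CUSTOMER-FILE is served by BDB/VBISAM or by the ODBC/OCI handler is decided by the runtime's file-handler configuration, not by the program:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. CUSTLOOK.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *> an ordinary indexed file as far as the program is concerned
           SELECT CUSTOMER-FILE ASSIGN TO "CUSTOMER"
               ORGANIZATION IS INDEXED
               ACCESS MODE IS DYNAMIC
               RECORD KEY IS CUST-ID
               FILE STATUS IS WS-STAT.
       DATA DIVISION.
       FILE SECTION.
       FD  CUSTOMER-FILE.
       01  CUSTOMER-REC.
           05 CUST-ID      PIC 9(6).
           05 CUST-NAME    PIC X(30).
           05 CUST-BALANCE PIC S9(7)V99 COMP-3.
       WORKING-STORAGE SECTION.
       01  WS-STAT         PIC XX.
       PROCEDURE DIVISION.
           OPEN INPUT CUSTOMER-FILE
           MOVE 100042 TO CUST-ID
      *>   a keyed read - with the ODBC handler this becomes a lookup
      *>   on the table's primary key behind the scenes
           READ CUSTOMER-FILE
               INVALID KEY DISPLAY "no such customer: " CUST-ID
               NOT INVALID KEY DISPLAY CUST-ID " " CUST-NAME
           END-READ
           CLOSE CUSTOMER-FILE
           STOP RUN.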
Anonymous - 2020-10-12
Very interesting. Where can I get documentation on that?
My goal is to transform, at a safe pace, a complete set of indexed files to a relational database.
And once I have the relational database, I could access it both from classic indexed file calls and from SQL interfaces, from Cobol or other languages.
My project is based on an OpenVMS platform. I think I'll first have to port GnuCOBOL to OpenVMS, and after that merge the parts in native OpenVMS Cobol with the ported GnuCOBOL. And I'll use all the tools I find.
What do you think about Berkeley DB, and about attempting to promote a relational DB view from it?
Sounds like a reasonable approach and an interesting project.
I'm not sure if that is the best one, as you'd have to build GnuCOBOL and all dependencies (at least Berkeley DB or VBISAM) on OpenVMS (assuming you have the necessary tools installed/licensed).
It is likely much easier to unload all indexed files to plain sequential files on OpenVMS, then copy these along with the COBOL sources over to a "more open" platform (GNU/Linux is available on nearly all hardware platforms and provides a rich build environment) where you've installed GnuCOBOL. Then load all the sequential files into one of the available indexed handlers, and especially compile all your COBOL sources with it (possibly changing some of the defaults to cater for OpenVMS compatibility).
Once you are sure that GnuCOBOL works as expected you can then move the indexed handlers, possibly one by one, to use ODBC (which of course needs to work outside of GnuCOBOL first).
Concerning how to set up GnuCOBOL and ESQL, you'll find some documentation in the FAQ, with extended documentation at https://gitlab.cobolworx.com/gnucobol/sql .
In any case I suggest registering/logging in to remove the additional moderation need of your posts.
Last edit: Simon Sobisch 2020-10-12
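A minimal sketch of that unload step (names invented; with COMP/COMP-3 fields you would typically also unpack binary data to DISPLAY in the output record before moving files between architectures):

       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNLOAD-CUST.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IDX-FILE ASSIGN TO "CUSTOMER"
               ORGANIZATION IS INDEXED
               ACCESS MODE IS SEQUENTIAL
               RECORD KEY IS IDX-KEY.
           SELECT SEQ-FILE ASSIGN TO "CUSTOMER.SEQ"
               ORGANIZATION IS SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  IDX-FILE.
       01  IDX-REC.
           05 IDX-KEY  PIC 9(6).
           05 IDX-DATA PIC X(74).
       FD  SEQ-FILE.
       01  SEQ-REC     PIC X(80).
       WORKING-STORAGE SECTION.
       01  WS-EOF      PIC X VALUE "N".
       PROCEDURE DIVISION.
           OPEN INPUT IDX-FILE
           OPEN OUTPUT SEQ-FILE
           PERFORM UNTIL WS-EOF = "Y"
      *>       in sequential access mode READ walks primary-key order
               READ IDX-FILE
                   AT END MOVE "Y" TO WS-EOF
                   NOT AT END WRITE SEQ-REC FROM IDX-REC
               END-READ
           END-PERFORM
           CLOSE IDX-FILE SEQ-FILE
           STOP RUN.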
I second Simon on moving the files out of OpenVMS.
But if you can't, because everything must be on the same machine/cluster (IA64, Alpha or VAX?), then on top of porting the GnuCOBOL dependencies (and GNV seems to be stalled), you will have to check the dialect of C, and implement, or have Simon et al. implement, native COBOL constructs like conditional "status is success".
Not an easy job but feasible.
My first goal is being able to transform the set of indexed files into a set of tables in an RDBMS, and being able to create reasonable interfaces in the legacy program.
Perhaps I could use a Linux workbench for that, but clearly the result will be on OpenVMS, plus perhaps a Postgres RDBMS on the network.
In the meantime it could be good to have a complete GnuCOBOL workbench on OpenVMS, but that is not the first goal.
And no, GNV is not stalled. It is a little bit asleep, but not dead :)
have to check the dialect of C and [...] implement native COBOL constructs like conditional "status is success"
As GnuCOBOL still uses ANSI C89 (with some extensions, often checked and only used depending on values in config.h), the C part should not be the biggest issue. GnuCOBOL 4.x has not been tested on many exotic systems yet and includes a bunch of new code (mostly fileio related), so there may be some things to tweak, but it should be doable.
@bgiroud Do you know of COBOL dialect issues that are special and may not be available in GnuCOBOL yet?
Could you possibly come up with an openvms.conf (COBOL dialect configuration)?
Out of interest: do you have access to COBOL on OpenVMS (for comparison)?
@sf-mensch I'm pretty sure that CALL xxx BY DESCRIPTOR, for example, and conditionals testing SUCCESS or FAILURE aren't (yet) in GnuCOBOL.
Probably not, for two reasons: not enough time and...
No more access to OpenVMS (since 2005). But if it might help: by googling "VSI OpenVMS VSI COBOL Reference Guide" you can download the future (or current?) reference manual.
And Gérard Calliet will be the ideal man given his profile.
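For readers who don't know the dialect, the two constructs look roughly like this (written from memory, so check the exact rules in that reference guide; LIB$DATE_TIME is just a convenient RTL routine for the example):

       01  RET-STAT    PIC S9(9) COMP.    *> a VMS condition value
       01  WS-TIME-BUF PIC X(23).

      *> argument passed by string descriptor, status caught via GIVING
           CALL "LIB$DATE_TIME" USING BY DESCRIPTOR WS-TIME-BUF
               GIVING RET-STAT
      *> the SUCCESS/FAILURE class conditions test the low bit of the
      *> returned condition value
           IF RET-STAT IS FAILURE
               DISPLAY "LIB$DATE_TIME failed"
           END-IF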
Yes, we have BY DESCRIPTOR and SUCCESS and FAILURE testing with OpenVMS.
A complete port could be a little complex. But I don't plan to do that at the beginning.
I think I will at first use GnuCOBOL to isolate the routines which will do the access, routines I'll develop first in a Linux environment.
For a long time I'll use a heterogeneous workbench, converging later.
I know, big challenge.
I'm not sure I understand clearly what you said:
"""GnuCOBOL 4.0-early-dev provides an ODBC/OCI handler - this is a build-in feature - which allows you to "use COBOL indexed access" and actually access the database (it even allows for a one-time conversion of the files or a step-by-step conversion as you can include multiple file handlers there and choose the one used "per file")."""
Do you mean you can go on using the indexed file syntax and semantic, and something underground makes it as ODBC requests?
And the conversion tool is somewhere in the package?
Do you mean you can go on using the indexed file syntax and semantics, and something under the hood turns it into ODBC requests?
Yes. Either you configure GnuCOBOL (4+) to use ODBC as the default file handler, or you keep another one (like BDB or VBISAM, which is in most cases preferable, though I have not heard of anyone using that on OpenVMS yet - patches to VBISAM are welcome) and then specify some/all files to use the ODBC interface later.
And the conversion tool is somewhere in the package?
No conversion tool is needed.
cobc generates a file definition that libcob uses to know how to map this to the database - but actually this is only needed if you want to use database types (otherwise libcob will use plain char[] and numeric types; but using that mapping, which can also be constructed manually, you can even use datetime and friends in the DB [if there's a supported conversion]).
To allow libcob to access the database you'd have to set up ODBC (GnuCOBOL will use unixODBC by default for ODBC, which is then another dependency for your GnuCOBOL installation; but according to a very quick glance, "some guy on some site says unixODBC 2.2.13 builds on OpenVMS").
A "conversion tool" would simply be a COBOL program that reads through the indexed file sequentially and writes to another file that looks indexed to the COBOL program but is actually configured to be handled by the GnuCOBOL ODBC handler.
Anonymous - 2020-10-12
Aye. ISAM versus RDB performance is ugly. ISAM wins, no contest.
Add a couple of alternate indices to the ISAM definition - foreign key integrity becomes a challenge.
Normalization - any?
I agree - unload to a sequential file.
Analysis is required to convert any ISAM file definition arrays to an acceptable SQL construct (YTD-DATA OCCURS 1 TO 31 TIMES DEPENDING ON xxx.)
Reformat date fields to SQL format (see the sketch after this list)
Transform comp-3 data fields
Transform comp data fields (if needed)
Cleanse the data
Normalize - at least to first normal form
Create the RDB definition(s)
Load the RDB
Simple?
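As an illustration of the date and comp-3 items in the list above, a self-contained sketch (picture clauses and the 1950 century window are invented for the example):

       IDENTIFICATION DIVISION.
       PROGRAM-ID. XFORM-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-AMT-P     PIC S9(7)V99 COMP-3 VALUE -1234.56.
       01  WS-AMT-D     PIC S9(7)V99.           *> unpacked, loadable
       01  WS-DATE-IN   PIC 9(6) VALUE 201012.  *> legacy YYMMDD
       01  WS-DATE-IN-X REDEFINES WS-DATE-IN.
           05 WS-YY     PIC 99.
           05 WS-MM     PIC 99.
           05 WS-DD     PIC 99.
       01  WS-CC        PIC XX.
       01  WS-DATE-ISO  PIC X(10).              *> YYYY-MM-DD for SQL
       PROCEDURE DIVISION.
      *>   comp-3 to zoned decimal: a simple MOVE re-encodes the value
           MOVE WS-AMT-P TO WS-AMT-D
      *>   two-digit year to ISO date with an assumed century window
           IF WS-YY >= 50
               MOVE "19" TO WS-CC
           ELSE
               MOVE "20" TO WS-CC
           END-IF
           STRING WS-CC WS-YY "-" WS-MM "-" WS-DD
               DELIMITED BY SIZE INTO WS-DATE-ISO
           END-STRING
           DISPLAY WS-AMT-D " " WS-DATE-ISO
           STOP RUN.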
As long as it is not part of a REDEFINES, GnuCOBOL will handle binary/comp conversions on the fly (to the database numeric type) and also handles alternate keys as secondary indexes. If you add foreign key constructs in the DB and don't check before a WRITE/REWRITE/DELETE on the COBOL side, you may get some nice error (at least a non-0x file status), but I think this is not what the conversion discussed here is about (it was mainly about "accessing" the data outside of COBOL). (Sorry, I was answering James.)
You are totally right. I had all these alarms.
You are talking of a (sort of) canonical approach. My evaluation now is: is there a possibility of, and interest in, doing an incremental evolution? I think you would say: no hope.
I haven't made up my mind yet.
However, I also agree that it could not be a totally automatic approach. I have always thought automatic approaches to modernization cannot be successful. But semi-automatic ones have their chances in some cases. The use case will lead me.
Last edit: Gérard Calliet 2020-10-13
With GnuCOBOL we have the semantics of the file descriptions and accesses analyzed, which is a milestone. Perhaps someone has thought of using them and applying transformations to them?
Most of the advice you've received focuses on technology. When I read your question, what technology to use didn't cross my mind. My immediate thought was that no automated process exists to transform your indexed files into a relational database. The required information is not present in the files or the Cobol source code.
Back when SQL was young, many articles were published describing how to design SQL tables from file structures by, say, separating repeated groups into different tables. The process isn't particularly difficult for someone trained in the relational model, although SQL gives the uninitiated plenty of rope to hang themselves (and many do).
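For example (record layout and names invented), a repeated group in the record typically becomes a child table keyed by the parent key plus the occurrence number:

      *> legacy record: one row carries twelve monthly buckets
       01  CUSTOMER-REC.
           05 CUST-ID     PIC 9(6).
           05 CUST-NAME   PIC X(30).
           05 MONTH-SALES PIC S9(7)V99 COMP-3
                          OCCURS 12 TIMES.

       -- one possible relational design: the OCCURS becomes rows
       CREATE TABLE customer (
           cust_id   NUMERIC(6)  PRIMARY KEY,
           cust_name VARCHAR(30) NOT NULL
       );
       CREATE TABLE customer_sales (
           cust_id  NUMERIC(6) NOT NULL REFERENCES customer,
           month_no SMALLINT   NOT NULL
                    CHECK (month_no BETWEEN 1 AND 12),
           amount   NUMERIC(9,2),
           PRIMARY KEY (cust_id, month_no)
       );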
Once the database is designed, the application should be, too. The DBMS is more powerful than indexed files are; much of the work previously done in Cobol can be better, and faster, done by the DBMS. The alternative -- treating the SQL database like a file -- will perform badly and disappoint everyone.
My goal is to transform, at a safe pace, a complete set of indexed files to a relational database.
And once I have the relational database, I could access it both from classic indexed file calls and from SQL interfaces, from Cobol or other languages.
You may have heard: no battle plan survives first contact with the enemy.
Before you get very far along in that process, you will discover your plan is infeasible. Because the data models are completely different, your SQL tables, if correctly designed, will probably look very different from your files. As I said above, your application will also need to be changed to take advantage of the DBMS.
Much, much simpler is:
Design the database, accounting for all fields in the file definitions.
Write a one-way transformation to load the database from the files. Plan to run that process many times, possibly in production.
Write the SQL the application will use to access the database.
Adapt the Cobol to use the SQL (a before/after sketch follows below).
Test.
Profit!
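A fragment sketching the "adapt the Cobol" step (host variables and names are invented; the EXEC SQL shown follows the usual embedded-SQL conventions that precompilers such as esqlOC implement):

      *> before: row-at-a-time browse over the indexed file
           MOVE 0 TO CUST-ID
           START CUSTOMER-FILE KEY IS NOT LESS THAN CUST-ID
           END-START
           PERFORM UNTIL WS-EOF = "Y"
               READ CUSTOMER-FILE NEXT RECORD
                   AT END MOVE "Y" TO WS-EOF
                   NOT AT END DISPLAY CUST-ID " " CUST-NAME
               END-READ
           END-PERFORM

      *> after: the same browse as a cursor; filtering, ordering and
      *> aggregation can now be pushed into the DBMS
           EXEC SQL DECLARE C1 CURSOR FOR
               SELECT cust_id, cust_name
                 FROM customer
                ORDER BY cust_id
           END-EXEC
           EXEC SQL OPEN C1 END-EXEC
           PERFORM UNTIL SQLCODE NOT = 0
               EXEC SQL
                   FETCH C1 INTO :WS-CUST-ID, :WS-CUST-NAME
               END-EXEC
               IF SQLCODE = 0
                   DISPLAY WS-CUST-ID " " WS-CUST-NAME
               END-IF
           END-PERFORM
           EXEC SQL CLOSE C1 END-EXEC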
The easiest test-and-migration process is supported by a bi-directional transaction bridge. The new and old systems exist side by side, with identical data in the files and database. Some users use each system, and as changes are made on one side, they are automatically applied to the other. A process like that helps users learn the new system and find bugs using the interface they know. It avoids a "flag day" when everyone has to switch over all at once, replacing it with a "vote with your feet" approach, where the new system is more desirable, and the old one dies on the vine.
You are totally right. I had all these alarms.
You are talking of a (sort of) canonical approach. My evaluation now is: is there a possibility of, and interest in, doing an incremental evolution? I think you would say: no hope.
I haven't made up my mind yet.
However, I also agree that it could not be a totally automatic approach. I have always thought automatic approaches to modernization cannot be successful. But semi-automatic ones have their chances in some cases. The use case will lead me.
On 13/10/2020 16:11, Gérard Calliet wrote:
"You are totally right. I had all these alarms. You are talking of a (sort of) canonical approach. My evaluation now is: is there a possibility of, and interest in, doing an incremental evolution? I think you would say: no hope. I haven't made up my mind yet. However, I also agree that it could not be a totally automatic approach. I have always thought automatic approaches to modernization cannot be successful. But semi-automatic ones have their chances in some cases. The use case will lead me."
Take a look at: https://primacomputing.co.nz/PRIMAMetro/RDBandSQL.aspx
I used the toolkit to convert my ACAS O/S accounting system to use an RDB, initially MS SQL, then moved over to MySQL, as my version only supported MS but later versions did support other RDB systems.
You do have to normalise all the ISAM files to fourth normal form first. If you are not familiar with this, look at the details on the website.
Vince
Anonymous - 2020-10-13
https://primacomputing.co.nz/PRIMAMetro/RDBandSQL.aspx
Web site is not secure. Purportedly contains a trojan.
Secure Connection Failed
An error occurred during a connection to primacomputing.co.nz. SSL received a record that exceeded the maximum permissible length.
There's no issue reported here; virustotal says Yandex sees an issue and all other engines don't.
The SSL part is definitely fine; if you have issues with that, it is likely either a broken SSL in-place change (some internet security suites have that mis-feature) or the browser/OS is too old.
Anonymous - 2020-10-14
As Simon says - there is no problem; some browsers just cannot fully cope with the extra functionality built in, but it works OK with Firefox.
James Lowden is correct. I have worked for a company that did this. You always carry the primary key of the parent file in the child file. This gives you foreign keys in the child file to stop you having orphan rows. You can open cursors, and it gives you the ability to implement cascading deletes.
BTW, the further down you go, e.g. parent to child then to a further child, you need to carry the parent key and the key of the first child into the second child's row.
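An illustrative bit of DDL for that scheme (all names invented): each level's primary key is its parent's key plus its own qualifier, and the foreign keys are what prevent orphan rows and enable cascading deletes:

       CREATE TABLE orders (
           order_id NUMERIC(8) PRIMARY KEY
       );
       CREATE TABLE order_line (
           order_id NUMERIC(8) NOT NULL
               REFERENCES orders ON DELETE CASCADE,
           line_no  SMALLINT   NOT NULL,
           PRIMARY KEY (order_id, line_no)
       );
       CREATE TABLE line_shipment (
           order_id NUMERIC(8) NOT NULL,
           line_no  SMALLINT   NOT NULL,
           ship_no  SMALLINT   NOT NULL,
           PRIMARY KEY (order_id, line_no, ship_no),
           FOREIGN KEY (order_id, line_no)
               REFERENCES order_line ON DELETE CASCADE
       );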
Anonymous - 2020-10-20
Seems like an IBM IMS database - a hierarchical implementation (whether by key or RBA [pointer], the hierarchy is perpetuated).
But a relational database has no context without a language.
Yet this method provides context.
What goes around comes around, I gather.