
Identifying MicroFocus indexed files

GnuCOBOL
Boris Eng
2024-08-02
2024-08-09
  • Boris Eng

    Boris Eng - 2024-08-02

    Hello. I'm trying to identify the organization of some MicroFocus data files.

    Those files are associated with the following information:
    - indexed organization
    - sequential access
    - VSAM dataset organization
    - codeset ASCII
    - record format KS
    - it has duplicate alternate keys

    There is a single file, so there is no separate .idx file. I assume it contains both the data and the indexes. Here is the result of hexdump -n 100:

    0000000 fe33 0000 0000 0000 3232 3130 3430 3431
    0000010 3230 3832 3139 3232 3130 3430 3431 3230
    0000020 3932 3731 3e00 0200 0000 0801 0000 0000
    0000030 0000 0006 0000 0000 c800 0000 c800 0000
    0000040 0000 0000 0000 6c00 0000 0000 0203 0001
    0000050 0000 0000 0000 0000 0000 0000 0000 0000
    0000060 0000 0000
    0000064
    

    I assume it is an IDXFORMAT unsupported by GnuCOBOL. I tried to read it with GC3, GC4, BDB, V(B)ISAM, MF file format and I also tried to manually and naively separate the file into a .dat and an .idx file but nothing works.
    According to this:
    https://www.microfocus.com/documentation/visual-cobol/vc60/DevHub/HRFLRHFILE0B.html
    https://www.microfocus.com/documentation/server-express/sx20books/fhfile.htm
    It should be IDXFORMAT(8) since there's no separate index file but the structure of the header doesn't seem to match.

    Does anyone have an idea of what kind of file organization this is? Thank you!

     
    • Mickey White

      Mickey White - 2024-08-02

      Well, I think if there is only one .idx file, then that, to me, means there are no secondary indexes. Have you tried the file command?

      mickeyw@mickeyw-Meerkat:~/Xtra/Data$ file file2.dat
      file2.dat: Berkeley DB (Btree, version 9, native byte-order)
      mickeyw@mickeyw-Meerkat:~/Xtra/Data$
      
       
      • Boris Eng

        Boris Eng - 2024-08-02

        It just says "data". I believe the file contains both data and indexes (as for BDB files except that it's not a BDB file).

         
        • Ralph Linkletter

          If possible send me the file.
          Redacted if you can.
          A subset if you can.
          Zipped if you can.
          Do you have Micro Focus Enterprise Developer (MFE) or any other MF COBOL incarnation?

          I can run your file through some of my MF tools.
          The hex dump of the file is not sufficient for me.

          GnuCOBOL, via C-ISAM, only supports MF IDXFORMAT"1".

          If it would help I can use IDCAMS to REPRO the ISAM dataset to a sequential file.

           

          Last edit: Ralph Linkletter 2024-08-02
    • Vincent (Bryan) Coen

      Do you have a copy of the program that created this file?
      Do you have the record layout?

      If you do, the simplest and primary way is to create a sequential file from the ISAM file. Then, in a very small PROCEDURE DIVISION, do:
      *> Copy every ISAM record, in key order, to a sequential file.
      open input ISAM-FILE output SEQ-FILE.
      perform until exit
          read ISAM-FILE next record
              at end exit perform
          end-read
          write SEQ-RECORD from ISAM-RECORD
      end-perform
      close ISAM-FILE SEQ-FILE.
      goback.

      Then copy the sequential file to GC and read it in whatever way you need.

      The normal method is above for migrating ANY ISAM type file to another platform.

       

      Last edit: Vincent (Bryan) Coen 2024-08-06
      • Ralph Linkletter

        Boris does not have any Micro Focus (MF) products.
        The file he is using is an ISAM / VSAM file from the MF bank demo suite.
        He downloaded that file and is trying to read it with GnuCOBOL.
        He did not download the MF components.

        It is a MF IDXFORMAT"8" ISAM / VSAM file.

        Your suggestion would work if he had a license to the MF product.
        He does not. Reading an MF IDXFORMAT"8" file will not work standalone with GnuCOBOL.

         
  • appletonc

    appletonc - 2024-08-04

    Did you check the file header to make sure that it is in fact an MF indexed file?

    Index File - File Header
    - The first 4 bits are always set to 3 (0011 in binary) ...
    - Don't forget about endianness when dealing with binary data
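A minimal sketch of both points, in Python, applied to the first line of the hexdump posted at the top of the thread. It assumes the dump came from plain hexdump on a little-endian machine, which prints 16-bit words with the two bytes swapped. The helper name words_to_bytes is made up for illustration, and reading the remaining 12 bits as a length is an assumption based on the Micro Focus header description.

```python
def words_to_bytes(dump_line: str) -> bytes:
    """Turn one line of default hexdump output into the bytes as stored on disk.

    Assumes a little-endian host: the word printed as "fe33" is stored as 33 fe.
    """
    fields = dump_line.split()[1:]  # drop the leading offset column
    return b"".join(bytes.fromhex(word)[::-1] for word in fields)


first_line = "0000000 fe33 0000 0000 0000 3232 3130 3430 3431"
header = words_to_bytes(first_line)

# High nibble of byte 0: the "first 4 bits" of the file header.
system_nibble = header[0] >> 4
# Remaining 12 bits of the first two bytes, read big-endian (assumed length field).
length_field = int.from_bytes(header[:2], "big") & 0x0FFF

print(header[:4].hex(" "))
print(system_nibble, length_field)
```

On the posted dump the file really starts with 33 fe rather than fe 33, so the leading nibble is 3 once the byte swap is undone.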

    I am not aware of any tools available which can read or dump the structure for MF IDXFORMAT"8".

    You could write your own tool to dump the structure (keys etc.), but that is not a trivial task.

    Consequently, the best bet is to write a COBOL program, compiled with a MF compiler, and output the data in sequential format.

     
  • Boris Eng

    Boris Eng - 2024-08-05

    According to Ralph's analysis, it looks like it is IDXFORMAT"8". But it doesn't seem to match the information I have about the structures. I might be missing something.

    appletonc wrote: "You could write your own tool to dump the structure (keys etc.), but that is not a trivial task."

    Aren't file structures known already? So I guess it should not be too difficult? I may be wrong.

    Thank you for your answers anyway!

     
    • Ralph Linkletter

      It is IDXFORMAT"8".
      What is it that you are attempting to do?
      What documentation are you looking at that defines the internal composition of an MF IDXFORMAT"8" ISAM / VSAM file?

      If I recall properly, records can contain data and index references.
      There is no separate area of the data file dedicated only to the indices.

      You could decode the first 128 bytes of the record to determine some of the file characteristics. Each recent MF VSAM file has a 128-byte header. Beyond the 128-byte header are data and index pointers.
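As a rough illustration of that kind of decoding, here is a Python sketch run against the first 64 bytes from the hexdump at the top of the thread, already converted from hexdump's swapped 16-bit words to on-disk order. The field offsets and the organization codes are assumptions taken from a reading of the Micro Focus file header documentation linked earlier in the thread, not something checked against MF tooling, and parse_header is a hypothetical name.

```python
import struct

# First 64 bytes of the posted file, in on-disk byte order.
raw = bytes.fromhex(
    "33fe000000000000" "3232303130343134"
    "3032323839313232" "3031303431343032"
    "32393137003e0002" "0000010800000000"
    "0000060000000000" "00c8000000c80000"
)


def parse_header(buf: bytes) -> dict:
    """Pick a few fields out of the header (offsets are assumptions)."""
    return {
        "created": buf[8:22].decode("ascii"),  # assumed YYMMDDHHMMSSCC
        "organization": buf[39],               # assumed 1=seq, 2=indexed, 3=relative
        "idxformat": buf[43],                  # assumed index file type
        "max_record_length": struct.unpack(">I", buf[54:58])[0],
        "min_record_length": struct.unpack(">I", buf[58:62])[0],
    }


info = parse_header(raw)
for name, value in info.items():
    print(f"{name}: {value}")
```

Under those assumptions the posted bytes decode to organization 2, index type 8, and a fixed record length of 200 bytes.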

      Although you can probably discern "the file structure" well enough to read the records, I doubt that you could preserve the integrity of the file well enough to implement write, rewrite, and delete support.

      Somewhere within the MF paradigm is a process akin to IBM control interval splitting.

      The record length in the example is 200 bytes, but that is only what the MF I-O system presents to the application program; the actual physical file is a collection of VSAM leaf nodes and a doubly linked list.

      The attachment is a zone over decimal hex dump of parts of the file. Notice the mix of data records and index entries.

       

      Last edit: Ralph Linkletter 2024-08-05
  • Chuck H.

    Chuck H. - 2024-08-05

    Boris,

    Do you have access to MF Visual COBOL?

    If so, the REBUILD utility might be able to convert the files to C-ISAM, after which you may be able to access them with GnuCOBOL using VISAM.

     
    • Boris Eng

      Boris Eng - 2024-08-06

      I currently don't but I'll think about it. Thank you!

       
      • Ralph Linkletter

        Removed (defect detected).

         

        Last edit: Ralph Linkletter 2024-08-10
        • Vincent (Bryan) Coen

          On 09/08/2024 21:37, Ralph Linkletter wrote:

          > This is a GnuCOBOL program BUT it needs CBL_ routines written by Chuck H.
          >
          > CBL_CHECK_FILE_EXISTS

          CBL_CHECK_FILE_EXIST already exists and has for some time.

          > CBL_SPLIT_FILENAME

          Not really needed, as there are COBOL verbs that do something similar, e.g. UNSTRING. See the PG or PR for usage if needed.

           
          • Ralph Linkletter

            See the MF or Fujitsu PG for the calling syntax.
            Both COBOL compilers implemented CBL_SPLIT_FILENAME and CBL_JOIN_FILENAME as ease-of-use subroutines.

            Granted, UNSTRING could be used, but why, when there is a specific service available to do so?

            The CBL_SPLIT_FILENAME parameter list:

                       05 SPLIT-JOIN-PARAM.
                           10  PARAM-LENGTH PIC 9(4) COMP-4 VALUE 24.
                           10  SPLITJOIN-FLG1 PIC 9(4)  COMP-4 VALUE 0.
                           10  SPLITJOIN-FLG2 PIC 9(4)  COMP-4.
                           10  XPATH-STRT PIC 9(4) COMP-4.
                           10  XPATH-LEN PIC 9(4) COMP-4.
                           10  BASE-STRT PIC 9(4) COMP-4.
                           10  BASE-LEN PIC 9(4) COMP-4.
                           10  EXT-STRT PIC 9(4) COMP-4.
                           10  EXT-LEN PIC 9(4) COMP-4.
                           10  TOTAL-LENGTH PIC 9(4) COMP-4.
                           10  SPLIT-BUF-LEN PIC 9(4) COMP-4    VALUE 100.
                           10  JOIN-BUF-LEN PIC 9(4) COMP-4     VALUE 100.
                           10  FIRST-PATH-LEN PIC 9(4) COMP-4.
            

            Chuck implemented CBL_SPLIT_FILENAME for GnuCOBOL.
            It is probably hiding somewhere in "contributions".
            Thanks for the feedback.

             

            Last edit: Ralph Linkletter 2024-08-09
