[Cobolforgcc-devel] Re: Cobol for GCC
From: Tim J. <te...@me...> - 2001-09-10 20:28:34
Rama,

Sorry for the delay in replying. I am getting my house ready for selling at auction at the moment.

In the discussion below, all the file names are assumed to start with "cobr_". Thus temp.c is really cobr_temp.c.

The file cobr_sort_overview.txt describes the overall structure. At the moment only routines 4 (basic in-core sort) and 6 (compare) have been completed. Routine 7 (the sort IO routines) has been written but not tested. These sort IO routines are meant for handling IO to sort work files when the sort is too large to fit in memory. They are not for doing the IO to the actual input and output files specified by the programmer.

The way large sorts work is that they take the input a chunk at a time and sort each chunk in memory. If all the input fits in memory, then there is only one chunk and we are done. If it does not fit in memory, then the chunks need to be written out to disk using routine 7 (IO), and then merged. The merge works by reading in several chunks (maybe up to 10) a record at a time and merging them into one output chunk, which is written out as the merge proceeds. So each merge pass reduces the number of chunks by a factor of, say, 10. Eventually there is only one chunk and you are done.

In COBOL the input to a sort can be a file or files, or an input procedure. Either way, the compiler will generate the code to read the file and will pass the records one at a time to the sort/merge executive (routine 3) using a routine called something like 'sort_put_record'. Once the sort is done, the sort/merge executive would hand the records back to the compiler-generated code one at a time. Presumably this would be done by the generated code calling a routine called something like 'sort_get_record'. The generated code would then either call the output procedure or write the records to a file, depending on what the programmer asked for in the code. For a merge, the input files are assumed to be in order.
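The chunk-and-merge scheme described above can be sketched in C roughly as follows. This is only an illustration, not the cobr_ code: the names (sort_chunks, merge_pass, chunked_sort, MAX_FAN_IN) are made up for the example, the "records" are plain ints, and the chunks live in one array in memory rather than in work files on disk. Each merge pass combines up to fan_in adjacent sorted chunks into one, so the number of chunks shrinks by that factor per pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_FAN_IN 16   /* hypothetical limit on chunks merged at once */

/* qsort comparison for ints (stands in for the real compare routine). */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Phase 1: sort each chunk of at most chunk_len items in place.
   Returns the number of chunks produced. */
size_t sort_chunks(int *data, size_t n, size_t chunk_len)
{
    size_t chunks = 0;
    assert(chunk_len > 0);
    for (size_t start = 0; start < n; start += chunk_len) {
        size_t len = (n - start < chunk_len) ? n - start : chunk_len;
        qsort(data + start, len, sizeof data[0], cmp_int);
        chunks++;
    }
    return chunks;
}

/* One merge pass: merge each group of up to fan_in adjacent sorted
   chunks of src into a single sorted chunk in dst.
   Returns the chunk length after the pass. */
size_t merge_pass(const int *src, int *dst, size_t n,
                  size_t chunk_len, size_t fan_in)
{
    size_t group_len = chunk_len * fan_in;
    size_t out = 0;
    assert(fan_in >= 2 && fan_in <= MAX_FAN_IN);
    for (size_t base = 0; base < n; base += group_len) {
        size_t pos[MAX_FAN_IN], end[MAX_FAN_IN];
        for (size_t i = 0; i < fan_in; i++) {
            size_t start = base + i * chunk_len;
            pos[i] = (start < n) ? start : n;
            end[i] = (start + chunk_len < n) ? start + chunk_len : n;
        }
        for (;;) {
            size_t best = fan_in;   /* fan_in means "nothing left" */
            for (size_t i = 0; i < fan_in; i++)
                if (pos[i] < end[i] &&
                    (best == fan_in || src[pos[i]] < src[pos[best]]))
                    best = i;
            if (best == fan_in)
                break;
            dst[out++] = src[pos[best]++];   /* emit smallest head record */
        }
    }
    return group_len;
}

/* Sort data[0..n): chunk sort, then merge passes until one chunk is left.
   Returns the number of merge passes, or -1 if allocation fails. */
int chunked_sort(int *data, size_t n, size_t chunk_len, size_t fan_in)
{
    int passes = 0;
    int *tmp;
    sort_chunks(data, n, chunk_len);
    if (n <= chunk_len)
        return 0;                   /* everything fitted in one chunk */
    tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    while (chunk_len < n) {
        chunk_len = merge_pass(data, tmp, n, chunk_len, fan_in);
        memcpy(data, tmp, n * sizeof *data);
        passes++;
    }
    free(tmp);
    return passes;
}
```

For example, calling chunked_sort on 10 items with chunk_len 2 and fan_in 3 starts with 5 chunks, which the first pass reduces to 2 and the second to 1. A real implementation would read and write the chunks through the work file IO routines instead of copying arrays.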
The compiler would have to pass information to the sort/merge executive to specify things like the maximum memory to be used, where to put the work files, the maximum size of the work files, and the details of the sort fields and the collating sequence (see below). This interface would be similar to the interface to sort.c.

I hope this clarifies things; if not, please ask some more questions. See also below...

I have cc'd this to cobolforgcc-devel, to keep a record of this. I hope you don't mind.

Regards,
Tim Josling

"Linga, Rama Krishna (Rama)" wrote:

> Hi Tim.
>
> I could not understand a few things regarding this sort/merge.

Quite understandable.

> 1. What is the prime objective of this? Is it to write an
> equivalent code for converting SORT - MERGE usages of COBOL in C. Then what
> are all these collating sequences and how many of them are primarily related
> to this code and in what way?

The main aim is, as you said, to support the sort/merge verbs of COBOL. The collating sequences are used in the compare routine. In COBOL you can specify a collating sequence, which means characters are compared using the collating sequence rather than using the binary values of the characters. See cobr_compare.[ch]. Effectively, the characters are converted using the lookup table (collating sequence) before being compared.

> 2. And what are we sorting? Data files / text files and what
> are the format of these files?

The intention is to support both text files (delimited by \n) and non-text files. Non-text files can be either fixed length or variable length, with a record control word at the start giving the length. However, at the moment none of the code to support the various file formats has been written; only some of the core sort routines have been written.

> 3. And how do we use these formats for sorting. Like, how do we
> know about the field we are going to use for sorting.

The overview.txt file gives the suggested module structure.
Ted has written 4 - basic in core sort (sort.c) and 6 - compare function (compare.c), and had started 7 - sort IO (sort_io.c), but I don't think 7 was complete. The sort.c routine is passed the structure of the fields in the sort_init call, in the parameter sort_fields. I assume that the compiler generated code would pass similar information to the sort/merge executive. The compiler will implement routine 1. This would pass the details of the fields to the sort/merge executive (not yet written), which would then call the sort/merge and IO routines.

> 4. When will be the compiler generated code uses the run time
> interfaces of sort and merge?

The compiler generated code will call routine 3 (sort/merge executive). The interface for this has not been specified.

> 5. When is the command levels are used?

The command level (routine 2) would be a stand-alone utility, to be written later on, using the sort/merge code.

> 6. What exactly is status of this sort/merge? I looked into
> cobr_sort_readme.txt but that is so vague. I could not get much out of it.

If you look at the overview.txt, routines 4 and 6 have been done, and part of 7 (as described above). See also below.

I tend to think that sort.c (4) and compare.c (6) can be kept, but maybe sort_io.c (7) could be redone. If I were doing this, I would probably do the merge routine next, then the work file IO routines and buffer management routines (routine 7). However, it is up to you whether you want to use all or part of Ted's code. It may be that you would find it easier to start again than to try to dissect his code.

> 7. What about merge and sort-merge routines. The current stuff
> appears like just sort related.

No merge code has been written yet.

> Before start writing the code, I would like to know these things.
>
> Regards.
> rama
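To illustrate the answer to question 1, a comparison through a collating sequence might look something like the sketch below. This is not the interface of cobr_compare.[ch]; the function names and the 256-entry table layout are assumptions made for the example. Each byte is translated through the lookup table first, and the translated values are compared instead of the raw binary values.

```c
#include <assert.h>
#include <stddef.h>

/* Identity table: every byte collates at its own binary value. */
void collate_identity(unsigned char table[256])
{
    for (int i = 0; i < 256; i++)
        table[i] = (unsigned char)i;
}

/* Example table: lower-case letters collate as their upper-case
   counterparts (assumes an ASCII-style contiguous alphabet). */
void collate_fold_case(unsigned char table[256])
{
    collate_identity(table);
    for (int c = 'a'; c <= 'z'; c++)
        table[c] = (unsigned char)(c - 'a' + 'A');
}

/* Compare two fields of len bytes using the collating table.
   Returns -1, 0 or 1, like the sign of a memcmp result. */
int collate_compare(const unsigned char *a, const unsigned char *b,
                    size_t len, const unsigned char table[256])
{
    assert(table != NULL);
    for (size_t i = 0; i < len; i++) {
        unsigned char x = table[a[i]];   /* translate, then compare */
        unsigned char y = table[b[i]];
        if (x != y)
            return (x < y) ? -1 : 1;
    }
    return 0;
}
```

With the identity table, "abc" sorts after "ABD" because 'a' has a larger binary value than 'A'; with the case-folding table it sorts before, because only the last character differs after translation.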