GnuCOBOL / Discussion / Help getting started: Issue with GCSORT using temporary files and fixed-length records

Leonardo Barbieri - 2025-06-16

Hi,
first of all, thank you again for your work on GCSORT.

We are currently experiencing an unexpected behavior while using GCSORT in a relatively simple case involving the SUM FIELDS operation to count records, particularly when temporary files are used. The issue appears even with just two temporary files, but it does not occur when the record count remains within the configured memory limits.

Context
GCSORT version: 01.04.05
OS: Linux RedHat 8
Execution mode: shell script attached (lines 23 and 24 must be updated to point to the correct absolute path of GCSORT libraries and binary)

Description of the issue
When GCSORT exceeds the available memory and starts using temporary files, the output file contains truncated records: the fixed-length 80-character record is interrupted by a 0A (line feed) and followed by corrupted or garbage bytes, as shown in this hex dump:

00000000: 3033 3236 3035 3935 2020 2020 2020 2020  03260595
00000010: 2020 2020 2020 2020 2020 2020 2020 2020
00000020: 2020 2020 2020 2020 2020 2020 2020 2020
00000030: 2020 2020 2020 2020 2020 2020 2020 2020
00000040: 0a66 79b3 0f00 0000 0051 0000 0030 3030  .fy......Q...000
00000050: 30                                       0

If we increase GCSORT_MEMSIZE, pushing the use of temporary files further away, the issue disappears, but it reappears as soon as memory is again exceeded and temporary files are used.

Test case objective
We generate an SQ file (ListRecordDaContare.list) containing the string 00000001 repeated over 3 million times.
The sort card used by GCSORT is:

USE ./ListRecordDaContare.list ORG SQ RECORD F,8
GIVE ./NumeroRecordContati.txt ORG SQ RECORD F,81
INREC FIELDS=(1,8,72X,X'0A')
SORT FIELDS=(1,8,CH,A)
SUM FIELDS=(1,8,ZD)

We expect a single output record containing the count (e.g., 03260595) in the first 8 characters.
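
For readers without the attached script, here is a minimal Python sketch of the test data and of the record we expect back. It is not the attached TestSortCase.sh; the helper names `write_input` and `expected_output` are made up for illustration, and it assumes the input records are written back to back with no line terminator, as RECORD F,8 implies.

```python
import os
import tempfile

RECORD = b"00000001"  # one 8-byte fixed-length record, no line terminator


def write_input(path, num_records, chunk=100_000):
    """Write num_records copies of RECORD back to back (ORG SQ, RECORD F,8)."""
    full, rest = divmod(num_records, chunk)
    block = RECORD * chunk
    with open(path, "wb") as out:
        for _ in range(full):
            out.write(block)
        out.write(RECORD * rest)


def expected_output(num_records):
    """The single 81-byte record we expect: 8-digit count, 72 spaces, LF."""
    return b"%08d" % num_records + b" " * 72 + b"\n"


if __name__ == "__main__":
    n = 3260595
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ListRecordDaContare.list")
        write_input(path, n)
        print(os.path.getsize(path))   # 26084760 (8 bytes per record)
    print(expected_output(n)[:8])      # b'03260595'
```

Since every input record carries the value 1 and all keys are equal, the summed field is simply the record count.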

Expected result
Correct output (when no temporary files are used):

00000000: 3033 3236 3035 3935 2020 2020 2020 2020  03260595
00000010: 2020 2020 2020 2020 2020 2020 2020 2020
00000020: 2020 2020 2020 2020 2020 2020 2020 2020
00000030: 2020 2020 2020 2020 2020 2020 2020 2020
00000040: 2020 2020 2020 2020 2020 2020 2020 2020
00000050: 0a

Observed issues
* Corrupted output when temporary files are used.
* GCSORT exits without any error, even when the output is invalid.
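
Because the return code alone does not reveal the corruption, a check on the output file can catch it after the run. The following is only a sketch of such a check, not something GCSORT provides: the function name `check_output` is hypothetical, and the layout it tests (8 digits, 72 spaces, one 0x0A) is taken from the INREC card above.

```python
import sys

RECLEN = 81  # GIVE ... RECORD F,81


def check_output(data, reclen=RECLEN):
    """Return a list of problems found in fixed-length output data."""
    problems = []
    if not data:
        problems.append("output is empty")
    if len(data) % reclen != 0:
        problems.append(f"size {len(data)} is not a multiple of {reclen}")
    # Inspect every complete record in turn.
    for start in range(0, len(data) - reclen + 1, reclen):
        rec = data[start:start + reclen]
        if not rec[:8].isdigit():
            problems.append(f"offset {start}: count field is not numeric")
        if rec[8:80] != b" " * 72:
            problems.append(f"offset {start}: filler is not 72 spaces")
        if rec[80:] != b"\n":
            problems.append(f"offset {start}: missing trailing 0x0A")
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        found = check_output(f.read())
    for line in found:
        print(line)
    sys.exit(1 if found else 0)
```

Run against the corrupted file from the hex dump, this reports the broken filler and the missing trailing line feed; the correct file produces no messages and exit status 0.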

Questions
1. Are there any internal guidelines or implementation notes explaining how GCSORT manages temporary files? It would be helpful to understand the flow to evaluate potential edge cases or contribute with targeted patches.
2. Is the silent behavior (no error) in case of corrupted output expected, or could it be improved with diagnostic messages?
3. Do you have any suggestions for workarounds other than increasing GCSORT_MEMSIZE?

I’m attaching the shell script that reproduces the issue. Please update lines 23 and 24 with the correct paths to the GCSORT libraries and binary before running it. You can then run it with this command:

./TestSortCase.sh --NumRecord=3260595

Additionally, logs from both successful (OK) and failed (KO) executions are attached to help with troubleshooting.

Thanks in advance for any insight or suggestions,
Leonardo Barbieri


TestSortCase.sh

TestSortCase.zip
  
- Sauro Menna - 2025-06-17
  
  Hi Leonardo,
  I am trying to set up a test case and hope to give you feedback shortly.
  Thanks for the details provided, they will be very helpful.
  I will provide feedback on Thursday.
  Regards.
  Sauro
  
  P.S. : Thank you Mickey.
  

Chuck Haatvedt - 2025-06-17

Sauro, I don't think that this should be allowed. Logically, how would you do a merge on intermediate / temporary files if the key fields are being changed during the merge operation? The IBM DFSORT product does not allow this.

In DFSORT, overlapping sort fields are generally allowed, but there are restrictions on SUM fields and their interaction with sort fields. SUM fields cannot overlap sort fields, nor can they overlap each other. Additionally, SUM fields must be numeric and the resulting sums must not overflow the output field. Overlapping fields within the SORT statement itself is permissible, but with limitations on the total length and position within the record.

Here's a breakdown of the concepts:

Sorting Fields (SORT FIELDS):
* Overlapping: DFSORT allows sorting fields to overlap each other, meaning you can sort by multiple fields within the same record, even if those fields share some bytes.
* Restrictions: There are restrictions on the total length of all sort fields. For DFSORT, the maximum combined length is typically 4092 bytes, and all sort fields must be within that limit from the start of the record.
* Data formats: You can specify various data formats for sorting, such as character (CH), packed decimal (PD), binary (BI), and more.

Summing Fields (SUM FIELDS):
* Restrictions: SUM fields cannot overlap sort fields. They also cannot overlap each other.
* Numeric: SUM fields must be numeric (e.g., packed decimal, binary).
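
The overlap restriction can be sketched as a simple interval test on (position, length) pairs. This is only an illustration in Python of the rule as stated above, not DFSORT or GCSORT code; the function names `overlaps` and `find_conflicts` are hypothetical.

```python
def overlaps(a, b):
    """True if two (position, length) fields share at least one byte.

    Positions are 1-based, as on the control cards.
    """
    a_start, a_len = a
    b_start, b_len = b
    return a_start < b_start + b_len and b_start < a_start + a_len


def find_conflicts(sort_fields, sum_fields):
    """List every SUM field that overlaps a SORT field or another SUM field."""
    conflicts = []
    for i, s in enumerate(sum_fields):
        for key in sort_fields:
            if overlaps(s, key):
                conflicts.append(("SUM overlaps SORT", s, key))
        for other in sum_fields[i + 1:]:
            if overlaps(s, other):
                conflicts.append(("SUM overlaps SUM", s, other))
    return conflicts


if __name__ == "__main__":
    # SORT FIELDS=(1,8,CH,A) with SUM FIELDS=(1,8,ZD): same bytes, rejected.
    print(find_conflicts([(1, 8)], [(1, 8)]))
    # A key at position 9, length 1 leaves the summed bytes 1-8 untouched.
    print(find_conflicts([(9, 1)], [(1, 8)]))   # []
```

With the sort card from the original post, the first call reports one conflict, which is the condition a mainframe sort would reject while parsing the control statements.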

Chuck Haatvedt


Last edit: Simon Sobisch 2025-06-17
  
- Simon Sobisch - 2025-06-17
  
  So GCSORT should directly abort before the start, right?
  

Chuck Haatvedt - 2025-06-17

In my opinion, it should report this condition as an error when editing the sort control cards.

If the goal is for GCSORT to behave similarly to mainframe sort utilities, then the above should be done.

Chuck Haatvedt
  
- Mickey White - 2025-06-18
  
  Sounds good, but I don't see how that would happen. I don't recall it happening on the mainframe. If I am on Windows and using Notepad to edit the sort control file, how is GCSORT going to know that?
  But good catch!
  

Chuck Haatvedt - 2025-06-17

Also, MFSORT reports this as: SORT234E SUM FIELD overlaps control field

Last edit: Simon Sobisch 2025-06-17


appletonc - 2025-06-18

Perhaps changing the sort field would be a better solution.

USE ./ListRecordDaContare.list ORG SQ RECORD F,8
GIVE ./NumeroRecordContati.txt ORG SQ RECORD F,81
INREC FIELDS=(1,8,72X,X'0A')
SORT FIELDS=(9,1,CH,A)
SUM FIELDS=(1,8,ZD)

- Chuck Haatvedt - 2025-06-18
  
  Yes, that would most likely resolve the issue with this particular sort. However, the underlying issue still remains. I think that a change to GCSORT is required so as to enforce the same restrictions as IBM's DFSORT product.
  
  Summing Fields (SUM FIELDS):
  Restrictions:
  SUM fields cannot overlap sort fields. They also cannot overlap each other.
  Numeric:
  SUM fields must be numeric (e.g., packed decimal, binary)
  
  Chuck Haatvedt
  

Sauro Menna - 2025-06-19

Hi,
I found the error. It occurred when SUM FIELDS was used while temporary files were in play: an area was not initialized correctly.
An updated gcsort, version 1.04.06, has been released that resolves the problem.

To reduce the execution time, one could replace INREC with OUTREC. The run is faster because the record area then matches the key area (8 bytes), and only the output record has the length of 81 characters.

use $SORTIN org sq record f,8
give $SORTOUT org sq record f,81
INREC FIELDS=(1,8,72X,X'0A') <----- OUTREC FIELDS=(1,8,72X,X'0A')
SORT FIELDS=(1,8,CH,A)
SUM FIELDS=(1,8,ZD)
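
As a rough illustration of why this helps, the Python snippet below compares the record data carried through the sort phase in the two variants for the 3,260,595 records of the test case. It is plain arithmetic on the record lengths from the cards above; it assumes nothing about GCSORT's internal buffers or per-record overhead, and the function name `sort_payload_bytes` is hypothetical.

```python
NUM_RECORDS = 3_260_595  # record count used in the test case


def sort_payload_bytes(num_records, record_length):
    """Bytes of record data that pass through the sort phase."""
    return num_records * record_length


# INREC builds the 81-byte record before sorting, OUTREC only after it.
with_inrec = sort_payload_bytes(NUM_RECORDS, 81)
with_outrec = sort_payload_bytes(NUM_RECORDS, 8)

print(with_inrec)    # 264108195
print(with_outrec)   # 26084760
```

With OUTREC the sort handles roughly a tenth of the data, so a given GCSORT_MEMSIZE holds correspondingly more records before temporary files are needed.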

I will carefully read your posts on the subject so that I can check for the introduction of more controls in gcsort.

Thank you.
Regards.
Sauro


Sauro Menna - 2025-06-20

Hi,
I updated gcsort to detect SUM FIELDS that overlap the control (sort) fields and to verify that the SUM FIELDS definition lies within the maximum record length.
The new version is 1.04.06b.
Greetings.
Sauro

- Simon Sobisch - 2025-06-20
  
  Sounds nice - what is the result if GCSORT is passed the definition above?
  
  - Sauro Menna - 2025-06-20
    
    In the above case:
    
    GCSORT Version 01.04.06b
    Operation : SORT
    INPUT FILE : ..\tests\files\ftestcase.dat FIXED (8,8) SQ
    OUTPUT FILE : ..\tests\files\testcase_gcsmt.srt FIXED (81,81) SQ
    SORT FIELDS : (1,8,ZD,A)
    INREC FIELDS : (1,8,72Z,X'0A')
    SUM FIELDS : (1,8,ZD)
    *GCSORT*S017X*ERROR: The SUM FIELD overlaps the control field.
    GCSORT - KO
    

Chuck Haatvedt - 2025-06-20

I would offer a different opinion: I would prefer that GCSORT follow the restrictions enforced by IBM / SYNCSORT / MICROFOCUS and not allow SUM FIELDS to overlap other fields.

Chuck Haatvedt
  
- Sauro Menna - 2025-06-20
  
  Hi,
  thanks for the suggestion, I agree.
  I will analyze the checks those products apply to the control statements so that gcsort replicates the same behavior.
  Regards.
  Sauro
  

Simon Sobisch - 2025-08-20

Fixed by [contrib:r1114] and [contrib:r1116] (version 1.04.06b now includes the checks)

Related

Commit: [r1114]
Commit: [r1116]

