Stingray - Schema-Based File Reader / Tickets / #22 bdw_iter() consuming 2 records at a time

That indicates a unit testing error.

Here's the data used to test a VB file with a BDW and RDW. This is the kind of file that requires RECFM=VB and would be read with the bdw_iter() function.

If the iterator is misbehaving, then this must be improper test data.

A hex dump of the first few bytes of the offending file might provide some insight as to what's wrong with this test case.

class TestEBCDICFile_VariableBlocked(TestEBCDICFile_Fixed):
    def test_should_get_cells(self):
        """Data has 4 byte BDW and 4 byte RDW in front of the row."""
        # Build 2 blocks, each holding a single record.
        self.data = io.BytesIO(
            b"\x00\x1d\x00\x00"  # BDW: block length 29 (4 + 25)
                b"\x00\x19\x00\x00"  # RDW: record length 25 (4 + 21)
                    b"\xe9\xd6\xe2"  # WORD="ZOS"
                    b"\xf1\xf2\xf3K\xf4\xf5"  # NUMBER-1="123.45"
                    b"\xf6\xf7\xf8\xf9\xf0"  # NUMBER-2="678.90"
                    b"\x00\x00\x12\x34"  # NUMBER-3=4660
                    b"\x98\x76\x5d"  # NUMBER-4=-987.65
            b"\x00\x1d\x00\x00"  # BDW: block length 29 (4 + 25)
                b"\x00\x19\x00\x00"  # RDW: record length 25 (4 + 21)
                    b"\xe9\xd6\xe2"  # WORD="ZOS"
                    b"\xf1\xf2\xf3K\xf4\xf5"  # NUMBER-1="123.45"
                    b"\xf6\xf7\xf8\xf9\xf0"  # NUMBER-2="678.90"
                    b"\x00\x00\x12\x34"  # NUMBER-3=4660
                    b"\x98\x76\x5d"  # NUMBER-4=-987.65
        )
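
The layout of this test data can be checked independently of Stingray. What follows is only a minimal sketch, not the library's implementation: walk_vb() is a hypothetical helper, and it assumes each BDW and RDW is a 2-byte big-endian length (counting the 4-byte header itself) followed by 2 zero bytes, as in the data above. It reports which block each record came from.

```python
import io
import struct

# One record exactly as in the test case: RDW plus 21 data bytes.
RECORD = (
    b"\x00\x19\x00\x00"       # RDW: 25 = 4-byte header + 21 data bytes
    b"\xe9\xd6\xe2"           # WORD
    b"\xf1\xf2\xf3K\xf4\xf5"  # NUMBER-1
    b"\xf6\xf7\xf8\xf9\xf0"   # NUMBER-2
    b"\x00\x00\x12\x34"       # NUMBER-3
    b"\x98\x76\x5d"           # NUMBER-4
)
# One block: BDW (29 = 4-byte header + one 25-byte record) plus the record.
BLOCK = b"\x00\x1d\x00\x00" + RECORD


def walk_vb(stream):
    """Yield (block_number, record_payload) for each record in a VB stream.

    Hypothetical helper for illustration only. Assumes both descriptor
    words hold a big-endian 16-bit length that includes the 4 header bytes.
    """
    block_number = 0
    while True:
        bdw = stream.read(4)
        if len(bdw) < 4:
            return  # end of data
        (block_len,) = struct.unpack(">H2x", bdw)
        block = stream.read(block_len - 4)
        offset = 0
        while offset < len(block):
            (rec_len,) = struct.unpack_from(">H2x", block, offset)
            if rec_len < 4:
                raise ValueError("RDW length smaller than its own header")
            yield block_number, block[offset + 4 : offset + rec_len]
            offset += rec_len
        block_number += 1


rows = list(walk_vb(io.BytesIO(BLOCK + BLOCK)))
print([(n, len(payload)) for n, payload in rows])  # [(0, 21), (1, 21)]
```

Under those assumptions the data parses as two blocks with one 21-byte record each, so an iterator that steps by block would yield exactly one record per step here.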

Last edit: Steven F. Lott 2015-03-01

RHarris - 2015-03-03

Nope, it is a user confusion error.

Looking back on the documentation, it should be emphasized:

The batch size is the number of blocks, not the number of records.

I wanted the RECFM_VB output to match the input (thus the use of bdw_iter), but I was thinking of batches in records, not blocks. Examining the hex data to answer your question reveals 2 records per BDW:

7fac 0000 3fd4 0000 ... 3fd4 0000 ... 7fac 0000 3fd4 0000 ... 3fd4 0000 ...
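
The header values in that dump do add up to two records per block. This is a quick arithmetic check, assuming the first two bytes of each descriptor word are a big-endian length that includes its own 4-byte header, and that both records in the block have the length shown:

```python
bdw_length = 0x7FAC  # 32684, first two bytes of the BDW in the dump
rdw_length = 0x3FD4  # 16340, first two bytes of each RDW in the dump

payload = bdw_length - 4  # bytes remaining after the BDW header
records_per_block, leftover = divmod(payload, rdw_length)
print(records_per_block, leftover)  # 2 0
```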

My only recommendation at this point would be for all test data to include multiple RDWs per BDW, since that is what real data looks like. Make sure all test cases still function under those conditions, which I expect they will.
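
As a rough sketch of what such test data could look like (make_record() and make_block() are hypothetical helpers, not part of Stingray, and they assume 2-byte big-endian lengths that include the 4-byte header), a block with two RDWs under one BDW can be generated rather than typed out by hand:

```python
import struct


def make_record(payload):
    """Prefix payload with an RDW: 2-byte length (header included) + 2 zero bytes."""
    return struct.pack(">H2x", len(payload) + 4) + payload


def make_block(payloads):
    """Wrap one or more records in a single BDW."""
    body = b"".join(make_record(p) for p in payloads)
    return struct.pack(">H2x", len(body) + 4) + body


# The same 21 data bytes used by the existing test case.
row = (
    b"\xe9\xd6\xe2"           # WORD
    b"\xf1\xf2\xf3K\xf4\xf5"  # NUMBER-1
    b"\xf6\xf7\xf8\xf9\xf0"   # NUMBER-2
    b"\x00\x00\x12\x34"       # NUMBER-3
    b"\x98\x76\x5d"           # NUMBER-4
)

block = make_block([row, row])             # two records in one block
print(block[:4].hex(), block[4:8].hex())   # 00360000 00190000
```

Computing the lengths this way keeps the BDW and RDW values consistent with the payload when the number of records per block changes.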

Steven F. Lott - 2015-03-03

Test case revised to show multiple records per block.


Steven F. Lott - 2015-03-03

status: open --> closed

assigned_to: Steven F. Lott